THE SCALING MEAN AND A LAW OF LARGE PERMANENTS


JAIRO BOCHI, GODOFREDO IOMMI, AND MARIO PONCE 


Abstract. In this paper we study two types of means of the entries of a
nonnegative matrix: the permanental mean, which is defined using permanents,
and the scaling mean, which is defined in terms of an optimization problem. We
explore relations between these two means, making use of important results
by Egorychev and Falikman (the van der Waerden conjecture), Friedland,
Sinkhorn, and others. We also define a scaling mean for functions in a much
more general context. Our main result is a Law of Large Permanents, a
pointwise ergodic theorem for permanental means of dynamically defined matrices
that expresses the limit as a functional scaling mean. The concepts introduced
in this paper are general enough to include as particular cases certain
classical types of means, for example symmetric means and Muirhead means.
As a corollary, we reobtain a formula of Halász and Székely for the limit of the
symmetric means of a stationary random process.


1. Introduction 


1.1. Matrix means. Among the myriad notions of means that appear in
mathematics, the arithmetic and the geometric means are the most important ones. In
this paper we introduce two notions of mean of the entries of a matrix. Both
notions are obtained by combination of the arithmetic and the geometric means. Let
us define them now, leaving details and proofs for later.

The first notion is the permanental mean of an $n \times n$ matrix $A$ with nonnegative
entries, which is defined as the $n$-th root of the arithmetic mean of the products
of the diagonals of $A$. The reason for the name is that this mean is related to the
permanent of $A$:

\[ \operatorname{pm}(A) := \left( \frac{\operatorname{per} A}{n!} \right)^{1/n} . \]

We discuss the main properties of the permanental mean in § 2.1. Some classic 
types of means are particular cases of the permanental mean: see Section 5. 

Two matrices $A$ and $B$ of the same size are called scalings of one another if their
entries are related by $b_{ij} = x_i a_{ij} y_j$ for suitable vectors $x$, $y$ with positive entries
$x_i$, $y_j$. If the geometric means of the entries of both vectors are 1, we say that $B$ is
a normalized scaling of $A$. Assuming $A$ to have nonnegative entries, we define the
scaling mean of $A$, denoted $\operatorname{sm}(A)$, as the infimum of the arithmetic means of all
normalized scalings of $A$. It turns out that the scaling mean shares many properties
with the permanental mean, and it can be characterized in several other ways: see
§§ 2.2-2.3.

This paper is devoted to the study of the relations between the scaling mean and
the permanental mean in a range of different settings.

If $A$ is a doubly stochastic $n \times n$ matrix, i.e. the arithmetic mean of each row and
each column is $1/n$, then by the celebrated van der Waerden conjecture (which was
an open problem for more than 50 years until being proved in 1981), the permanent
of $A$ is at least $n!/n^n$ or, in other words, its permanental mean is at least $1/n$. On
the other hand, as a consequence of the AM-GM inequality (see § 2.2), the scaling
mean of $A$ equals $1/n$, and therefore

\[ \operatorname{pm}(A) \ge \operatorname{sm}(A) \tag{1.1} \]

for doubly stochastic matrices. It transpires that this inequality holds true for all
nonnegative square matrices. This generalization of the van der Waerden conjecture
is actually a consequence of it, together with the fact that “most” nonnegative
matrices (in particular all positive ones) are scalings of doubly stochastic matrices,
a discovery that dates back to the 1960s with the works of Sinkhorn and others.
Details are provided in §§ 2.3-2.5, where we also characterize the cases of equality in
(1.1). Moreover, and more importantly for the purposes of this paper, approximate
equality holds in (1.1) for certain “repetitive” large matrices. The precise statement
is given in § 2.6 and is basically a reinterpretation of a result by Friedland, who in
1979 proved an asymptotically weaker version of the van der Waerden conjecture. That
fact is one of the foundations for the main result of this paper, which we describe
next.


1.2. The general setting for scaling problems. Scaling problems have been 
studied in infinite dimensions, for infinite matrices and for functions. In this paper 
we introduce a more general and abstract setting that includes the above ones as 
particular cases and opens a door for other possibilities. Basically, we consider
measurable functions on arbitrary probability spaces, and the allowed scaling
functions are those that are measurable with respect to certain fixed sub-σ-algebras;
see Section 3 for details.

We extend the definition of scaling mean to this general setting. We also prove
the existence of “doubly stochastic” scalings under certain boundedness
assumptions, thus extending previously known facts about the so-called Sinkhorn
decompositions for matrices or DAD problems for functions.


1.3. A Law of Large Permanents. On the other hand, it is not clear what should 
be the permanental mean of an infinite matrix. A natural attempt is to consider 
square truncations and then take the limit of the corresponding permanental means 
as the size of the square tends to infinity, provided of course that such limit exists. 
We prove that this is indeed the case for some classes of dynamically defined
matrices. Moreover, we identify the limit permanental mean as the scaling mean of
the function that controls the distribution of the matrix entries. The precise result 
is as follows: 

Theorem 4.1. Let $(X, \mu)$, $(Y, \nu)$ be Lebesgue probability spaces, let $T \colon X \to X$ and
$S \colon Y \to Y$ be ergodic measure preserving transformations, and let $f \colon X \times Y \to \mathbb{R}$
be a positive measurable function essentially bounded away from zero and infinity.
Then for $\mu \times \nu$-almost every $(x, y) \in X \times Y$, the permanental mean of the matrix

\[
\begin{pmatrix}
f(x, y) & f(Tx, y) & \cdots & f(T^{n-1}x, y) \\
f(x, Sy) & f(Tx, Sy) & \cdots & f(T^{n-1}x, Sy) \\
\vdots & \vdots & & \vdots \\
f(x, S^{n-1}y) & f(Tx, S^{n-1}y) & \cdots & f(T^{n-1}x, S^{n-1}y)
\end{pmatrix}
\]

converges as $n \to \infty$ to the scaling mean of the function $f$.

In this fairly general pointwise ergodic theorem we not only prove the almost 
everywhere convergence but we also identify the limit, thus completely solving the 
two main questions that arise when considering ergodic averages. 

Several results scattered in the literature are contained in this Law of Large 
Permanents. For example, the aforementioned result of Friedland is a corollary. 
The Law is flexible enough so that we can deduce from it variants of Birkhoff’s 
ergodic theorem where the arithmetic means are replaced by other types of means. 
For example, in § 5.1 we deduce an ergodic theorem for symmetric means, thus
reobtaining in a more transparent way a formula due to Halász and Székely [HS].

Let us remark that the literature contains asymptotic results about permanents 
of random oblong (i.e. non-square) matrices, a subject we will not deal with: see 
[RW] and references therein. There are fewer results for square matrices: we can 
only cite [TV]. 

1.4. Organization of the paper. In Section 2 we study permanental and scaling
means of matrices, proving the properties mentioned above, among several
others. In Section 3 we introduce an abstract setting for scaling problems, where we
define scaling means for functions and also prove an existence theorem for
functional scaling. Section 4 is devoted to the proof of the Law of Large Permanents
(Theorem 4.1). In Section 5 we present some corollaries concerning symmetric and
Muirhead means. Section 6 discusses some of the questions arising from our results,
and poses some conjectures.

Throughout this paper we use the following notations: 

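\[ \mathbb{R}_+ := [0, \infty), \qquad \mathbb{R}_{++} := (0, \infty). \]

The set of real (resp., nonnegative, positive) $m \times n$ matrices is denoted by $\mathbb{R}^{m \times n}$
(resp., $\mathbb{R}_+^{m \times n}$, $\mathbb{R}_{++}^{m \times n}$).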

2. Permanental and scaling means of matrices 

2.1. The permanental mean. The permanent of a square matrix $A = (a_{ij}) \in \mathbb{R}^{n \times n}$
is defined as

\[ \operatorname{per} A := \sum_{\sigma} \prod_{i=1}^{n} a_{i,\sigma(i)} , \]

where $\sigma$ runs over the permutations of $\{1, \dots, n\}$. This function was introduced
independently by Cauchy and Binet around 1812. It has many applications in 
combinatorics [vLW], probability [Ba3], among other areas. The book [Mi] is wholly 
dedicated to permanents and contains historical information and the most relevant 
results and applications up to the late 1970’s. 

Like the determinant, the permanent is a symmetric multilinear function of the
rows (or columns) of a matrix; it is also invariant under transposition. Despite
the similarities between the definitions of determinant and permanent, there is no
permanental analog of the Gaussian elimination algorithm, and indeed the
evaluation of the permanent is a computationally much harder problem: see [BR,
pp. 245ff].

When dealing with nonnegative matrices, the following function is in some senses 
better behaved than the permanent itself: 

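Definition 2.1. The permanental mean of a square nonnegative matrix $A \in \mathbb{R}_+^{n \times n}$
is defined as

\[ \operatorname{pm}(A) := \left( \frac{\operatorname{per} A}{n!} \right)^{1/n} . \]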


The permanental mean has the following properties, which allow us to consider 
it as a sort of average of the entries of the matrix: 

• Monotonicity: the permanental mean is increasing as a function of each of 
the entries of the matrix. 

• Reflexivity: a matrix with constant entries has a permanental mean equal 
to this constant. 

These two properties imply the following one: 

• Internality: the permanental mean is between the minimum and the
maximum of the entries of the matrix.

Some additional properties are: 

• Continuity. 

• Row-wise and column-wise homogeneity: the permanental mean on $\mathbb{R}_+^{n \times n}$
is homogeneous of degree $1/n$ as a function of each row and each column,
and in particular it is a homogeneous function of degree 1. Equivalently, if
$A$ is a nonnegative square matrix and $D$ is a diagonal matrix of the same
size and with positive main diagonal, then

\[ \operatorname{pm}(AD) = \operatorname{pm}(DA) = \operatorname{gm}(D) \operatorname{pm}(A), \]

where $\operatorname{gm}(D)$ denotes the geometric mean of the entries along the main
diagonal of $D$.

• Row-wise, column-wise, and transpositional symmetry: the permanental
mean is invariant under permutations of rows or columns, and under
transposition of the matrix. Equivalently, if $A$ is a nonnegative square matrix
and $P$ is a permutation matrix of the same size then

\[ \operatorname{pm}(AP) = \operatorname{pm}(PA) = \operatorname{pm}(A^\top) = \operatorname{pm}(A) . \]
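
The definition and the properties listed above can be checked numerically on small matrices. The following is a minimal sketch in Python, not taken from the paper: the helper names `permanent`, `permanental_mean`, and `geometric_mean` are our own, and the permanent is evaluated by brute force over all permutations, which is only feasible for small n.

```python
from itertools import permutations
from math import factorial, isclose, prod


def permanent(a):
    """Permanent of a square matrix (list of rows), by brute force over permutations."""
    n = len(a)
    return sum(prod(a[i][s[i]] for i in range(n)) for s in permutations(range(n)))


def permanental_mean(a):
    """pm(A) = (per(A) / n!)^(1/n) for a nonnegative n x n matrix A."""
    n = len(a)
    return (permanent(a) / factorial(n)) ** (1.0 / n)


def geometric_mean(v):
    """Geometric mean of a list of positive numbers."""
    return prod(v) ** (1.0 / len(v))


A = [[1.0, 2.0, 0.5],
     [3.0, 1.0, 2.0],
     [0.0, 4.0, 1.0]]
d = [2.0, 0.5, 3.0]  # main diagonal of a positive diagonal matrix D
n = len(A)

constant = [[7.0] * n for _ in range(n)]                     # reflexivity
DA = [[d[i] * A[i][j] for j in range(n)] for i in range(n)]  # rows scaled by D
AD = [[A[i][j] * d[j] for j in range(n)] for i in range(n)]  # columns scaled by D
AT = [list(col) for col in zip(*A)]                          # transpose

print(permanental_mean(constant))                  # close to the constant 7
print(permanental_mean(DA), permanental_mean(AD))  # both close to gm(D) * pm(A)
print(geometric_mean(d) * permanental_mean(A))
print(permanental_mean(AT), permanental_mean(A))   # equal up to rounding
```

The sample matrix has a zero entry on purpose: the properties hold for nonnegative matrices, not only for positive ones.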


A square matrix with nonnegative entries is called doubly stochastic if the sums
of the entries on each row and each column are equal to 1. The set of $n \times n$
doubly stochastic matrices is denoted by $\Omega_n$. By a classical theorem of Garrett
Birkhoff, $\Omega_n$ is a convex polytope whose vertices are the permutation matrices: see
e.g. [BR, MM, Mi]. A great deal of work has been devoted to the study of the permanent
of this kind of matrices, the van der Waerden conjecture being probably the most
relevant topic. If $S \in \Omega_n$ then

\[ \frac{n!}{n^n} \le \operatorname{per} S \le 1 . \tag{2.1} \]

The upper bound is trivial, while the lower bound was conjectured by van der
Waerden [vdW] in 1926 and proved in 1981 independently by Egorychev [Eg] and
Falikman [Fa]. Moreover, the minimum of the permanent on $\Omega_n$ is attained at the
matrix $J_n$ all of whose entries are equal to $1/n$ and the maximum is attained at
permutation matrices. In terms of the permanental mean, these bounds become:

\[ \frac{1}{n} \le \operatorname{pm}(S) \le \frac{1}{(n!)^{1/n}} \quad \text{if } S \in \Omega_n . \tag{2.2} \]

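2.2. The scaling mean. In the following definition, we identify each $x \in \mathbb{R}^n$ with
a column vector, and denote by $\operatorname{gm}(x)$ the geometric mean of its entries.

Definition 2.2. The scaling mean of a nonnegative matrix $A \in \mathbb{R}_+^{m \times n}$ is

\[ \operatorname{sm}(A) := \frac{1}{mn} \inf_{x, y} \frac{x^\top A y}{\operatorname{gm}(x) \operatorname{gm}(y)} , \tag{2.3} \]

where $x$ runs over $\mathbb{R}_{++}^m$ and $y$ runs over $\mathbb{R}_{++}^n$.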


Though the definition above also applies to oblong matrices, in the sequel we 
will mostly consider square matrices. 

Like the permanental mean, the scaling mean has the properties of monotonicity,
reflexivity (a consequence of the AM-GM inequality), and internality, which is why
we call it a “mean”.

The scaling mean also has the properties of row-/column-wise homogeneity and
row-/column-wise symmetry; these follow from the analogous properties of the
geometric mean. Transpositional symmetry obviously holds. Other useful properties
are given by the following three propositions:


Proposition 2.3. The function $\operatorname{sm} \colon \mathbb{R}_+^{m \times n} \to \mathbb{R}_+$ is concave and continuous.

Proof. Being defined as the infimum of a family of linear functions, the scaling mean
is a concave and upper semicontinuous function. Let us prove lower semicontinuity
at an arbitrary fixed $A = (a_{ij}) \in \mathbb{R}_+^{m \times n}$. Let $H$ be the set of nonnegative matrices
that have the same pattern of zeros as $A$, i.e.,

\[ H := \{ (b_{ij}) \in \mathbb{R}_+^{m \times n} : b_{ij} = 0 \Leftrightarrow a_{ij} = 0 \} . \]

Since $H$ is convex and relatively open with respect to its affine hull in $\mathbb{R}^{m \times n}$, it
follows from concavity (see [Ro, Thrm. 10.1]) that the restriction of the function
sm to $H$ is continuous. Consider a sequence $(A_k = (a_{k,ij}))$ in $\mathbb{R}_+^{m \times n}$ converging to
$A$. Define $B_k = (b_{k,ij})$ by

\[ b_{k,ij} := \begin{cases} a_{k,ij} & \text{if } a_{ij} \ne 0, \\ 0 & \text{otherwise.} \end{cases} \]

By monotonicity, $\operatorname{sm}(B_k) \le \operatorname{sm}(A_k)$. Moreover, $B_k \to A$ and $B_k \in H$ for large
enough $k$. Therefore $\liminf \operatorname{sm}(A_k) \ge \lim \operatorname{sm}(B_k) = \operatorname{sm}(A)$, thus establishing lower
semicontinuity. □


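Recall that $\Omega_n \subset \mathbb{R}_+^{n \times n}$ is the set of doubly stochastic matrices.

Proposition 2.4. If $A \in \Omega_n$ then $\operatorname{sm}(A) = 1/n$.

Proof. Let $A = (a_{ij}) \in \Omega_n$. For all $x, y \in \mathbb{R}_{++}^n$, by the weighted AM-GM inequality,

\[ \frac{1}{n} x^\top A y = \sum_{i,j} \frac{a_{ij}}{n} x_i y_j \ge \prod_{i,j} (x_i y_j)^{a_{ij}/n} = \operatorname{gm}(x) \operatorname{gm}(y). \]

Equality is attained if $x = y = (1, \dots, 1)^\top$, say. Therefore $\operatorname{sm}(A) = 1/n$. □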
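
Proposition 2.5. If $A$ is a nonnegative square matrix of block triangular form
$A = \begin{pmatrix} B & R \\ 0 & C \end{pmatrix}$, where $B$ and $C$ are square matrices, then $\operatorname{sm}(A)$ does not depend on $R$.

Proof. Suppose $A$, $B$, and $C$ have sizes $n \times n$, $k \times k$, and $\ell \times \ell$ respectively, so
$n = k + \ell$. For all $u \in \mathbb{R}_{++}$, consider the matrix:

\[ D(u) := \operatorname{diag}\bigl( \underbrace{u^{1/k}, \dots, u^{1/k}}_{k}, \underbrace{u^{-1/\ell}, \dots, u^{-1/\ell}}_{\ell} \bigr) . \]

Given $x, y \in \mathbb{R}_{++}^n$, let $x_u := D(u) x$ and $y_u := D(u^{-1}) y$. Then
$\operatorname{gm}(x_u) = \operatorname{gm}(x)$, $\operatorname{gm}(y_u) = \operatorname{gm}(y)$, and

\[ x_u^\top A y_u = x^\top \begin{pmatrix} B & u^{1/k + 1/\ell} R \\ 0 & C \end{pmatrix} y . \]

Therefore

\[ \operatorname{sm}(A) \le \lim_{u \to 0} \frac{1}{n^2} \frac{x_u^\top A y_u}{\operatorname{gm}(x_u) \operatorname{gm}(y_u)} = \frac{1}{n^2} \frac{x^\top A_0 y}{\operatorname{gm}(x) \operatorname{gm}(y)} , \]

where $A_0$ is the matrix obtained from $A$ by replacing $R$ with 0. Taking the infimum
over $x$, $y$, we obtain $\operatorname{sm}(A) \le \operatorname{sm}(A_0)$. On the other hand, $\operatorname{sm}(A) \ge \operatorname{sm}(A_0)$ by
monotonicity, thus concluding the proof. □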


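Remark 2.6. As a consequence of the AM-GM inequality, we have:

\[ z \in \mathbb{R}_+^m \ \Rightarrow \ \inf_{x \in \mathbb{R}_{++}^m} \frac{x^\top z}{\operatorname{gm}(x)} = m \operatorname{gm}(z). \]

Therefore (2.3) can be rewritten as:

\[ A \in \mathbb{R}_+^{m \times n} \ \Rightarrow \ \operatorname{sm}(A) = \frac{1}{n} \inf_{y \in \mathbb{R}_{++}^n} \frac{\operatorname{gm}(Ay)}{\operatorname{gm}(y)} . \tag{2.4} \]

As immediate consequences of this formula, we obtain the following additional
properties of the scaling mean:

\[ A \in \mathbb{R}_+^{m \times n}, \ B \in \mathbb{R}_+^{n \times r} \ \Rightarrow \ \operatorname{sm}(AB) \ge n \operatorname{sm}(A) \operatorname{sm}(B), \tag{2.5} \]

\[ A \in \mathbb{R}_+^{n \times n} \ \Rightarrow \ \operatorname{sm}(A) \le \frac{\rho(A)}{n} , \tag{2.6} \]

where $\rho$ denotes spectral radius. See also Remarks 2.12 and 2.21.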

Remark 2.7. The expressions after the inf’s in (2.3) and (2.4) are log-log-convex
functions with respect to $(x, y)$ and $y$, respectively, so the computation of the scaling
mean from either formula is equivalent to a convex minimization problem.


Remark 2.8. One may ask if the permanental mean also has the concavity
property obtained for the scaling mean in Proposition 2.3. The answer is no, because
otherwise the minimum of the permanental mean on the convex polytope $\Omega_n$ would be
attained at the boundary, while we know by the Egorychev-Falikman theorem that
this is not the case. On the other hand, the permanental mean is indeed concave
among positive definite symmetric matrices: see [Bh, p. 282].

2.3. Matrix scaling and Sinkhorn decompositions. Recall from the
introduction that two matrices $A, B \in \mathbb{R}^{m \times n}$ are called scalings of one another if there
exist diagonal matrices $E \in \mathbb{R}^{m \times m}$, $D \in \mathbb{R}^{n \times n}$ with positive main diagonals such
that $A = EBD$. Matrix scaling has applications in numerical analysis [GvL] and
economics [LS]. We will be especially interested in scaling when one of the matrices
is doubly stochastic, so we introduce the following:


Definition 2.9. A Sinkhorn decomposition of a square matrix $A$ is a factorization
of the form

\[ A = DSE , \]

where $S$ is doubly stochastic and $D$, $E$ are diagonal with positive main diagonals.

In 1964, Sinkhorn [Si1] proved that any positive square matrix can be scaled to a
doubly stochastic matrix and moreover the corresponding Sinkhorn decomposition
is unique up to multiplication of $D$ and $E$ by positive factors $\lambda$ and $\lambda^{-1}$, respectively.
Not all nonnegative square matrices possess Sinkhorn decompositions, however; an
example is $\begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}$.


The following proposition relates the previously defined means and Sinkhorn
decompositions. Recall that if $D$ is a diagonal matrix with positive entries along
the main diagonal then $\operatorname{gm}(D)$ denotes the geometric mean of these entries.


Proposition 2.10. If $A \in \mathbb{R}_+^{n \times n}$ has a Sinkhorn decomposition $A = DSE$ then

\[ \operatorname{sm}(A) = \frac{1}{n} \operatorname{gm}(D) \operatorname{gm}(E) = \frac{1}{n} \frac{\operatorname{pm}(A)}{\operatorname{pm}(S)} . \tag{2.7} \]

Moreover, if $x$ and $y$ are the vectors whose entries form the diagonals of $D^{-1}$ and
$E^{-1}$, respectively, then the infimum in (2.3) is attained at $x$, $y$.

Proof. The first part is an immediate consequence of Proposition 2.4 and the
row-/column-wise homogeneity of the scaling and the permanental means. If $x$ and $y$
are as described, then

\[ \frac{1}{n^2} \frac{x^\top A y}{\operatorname{gm}(x) \operatorname{gm}(y)} = \frac{1}{n} \operatorname{gm}(D) \operatorname{gm}(E) , \]

so the second part follows from the first one. □
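
Proposition 2.10 suggests a simple numerical recipe for the scaling mean of a positive matrix. The sketch below is an illustration under our own assumptions, not the paper's method: it uses the classical alternating row/column normalization (Sinkhorn iteration) with a fixed number of iterations, assumes the input is a positive square matrix so that the iteration converges, and all function names are hypothetical.

```python
from math import isclose, prod


def geometric_mean(v):
    """Geometric mean of a list of positive numbers."""
    return prod(v) ** (1.0 / len(v))


def sinkhorn_scaling(a, iterations=500):
    """Return (r, c) such that diag(r) A diag(c) is approximately doubly stochastic.

    Assumes A is a positive square matrix, so that the iteration converges.
    """
    n = len(a)
    r = [1.0] * n
    c = [1.0] * n
    for _ in range(iterations):
        # Rescale rows so that each row of diag(r) A diag(c) sums to 1.
        r = [1.0 / sum(a[i][j] * c[j] for j in range(n)) for i in range(n)]
        # Rescale columns so that each column sums to 1.
        c = [1.0 / sum(r[i] * a[i][j] for i in range(n)) for j in range(n)]
    return r, c


def scaling_mean(a, iterations=500):
    """sm(A) = gm(D) gm(E) / n, where D = diag(r)^(-1) and E = diag(c)^(-1)."""
    n = len(a)
    r, c = sinkhorn_scaling(a, iterations)
    return 1.0 / (n * geometric_mean(r) * geometric_mean(c))


def objective(a, x, y):
    """The quantity after the inf in (2.3), including the factor 1/n^2."""
    n = len(a)
    bilinear = sum(x[i] * a[i][j] * y[j] for i in range(n) for j in range(n))
    return bilinear / (n * n * geometric_mean(x) * geometric_mean(y))


A = [[2.0, 1.0, 3.0],
     [1.0, 4.0, 1.0],
     [5.0, 1.0, 2.0]]
r, c = sinkhorn_scaling(A)
print(scaling_mean(A))
print(objective(A, r, c))                              # agrees with the line above
print(objective(A, [1.0, 1.0, 1.0], [1.0, 2.0, 3.0]))  # any other (x, y) gives at least sm(A)
```

Here the vectors `r` and `c` play the role of the diagonals of $D^{-1}$ and $E^{-1}$, so evaluating the objective of (2.3) at them reproduces the value given by (2.7).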


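Example 2.11. The Sinkhorn decomposition of a $2 \times 2$ positive matrix
$\begin{pmatrix} a & b \\ c & d \end{pmatrix}$ is given by

\[
\begin{pmatrix} a & b \\ c & d \end{pmatrix}
=
\begin{pmatrix} \sqrt{ab} & 0 \\ 0 & \sqrt{cd} \end{pmatrix}
\begin{pmatrix}
\frac{\sqrt{ad}}{\sqrt{ad}+\sqrt{bc}} & \frac{\sqrt{bc}}{\sqrt{ad}+\sqrt{bc}} \\
\frac{\sqrt{bc}}{\sqrt{ad}+\sqrt{bc}} & \frac{\sqrt{ad}}{\sqrt{ad}+\sqrt{bc}}
\end{pmatrix}
\begin{pmatrix} \frac{\sqrt{ad}+\sqrt{bc}}{\sqrt{bd}} & 0 \\ 0 & \frac{\sqrt{ad}+\sqrt{bc}}{\sqrt{ac}} \end{pmatrix} ,
\]

and therefore the scaling mean is

\[ \operatorname{sm} \begin{pmatrix} a & b \\ c & d \end{pmatrix} = \frac{\sqrt{ad} + \sqrt{bc}}{2} . \]

By continuity (Proposition 2.3), this formula also holds for nonnegative matrices.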
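
The 2 x 2 case is small enough to verify by direct multiplication. The following sketch picks one admissible normalization of the diagonal factors (recall that $D$ and $E$ are only determined up to factors $\lambda$ and $\lambda^{-1}$); the specific choice $D = \operatorname{diag}(\sqrt{ab}, \sqrt{cd})$ and the sample entries are our own, made for illustration.

```python
from math import isclose, sqrt

a, b, c, d = 1.0, 2.0, 3.0, 4.0
sigma = sqrt(a * d) + sqrt(b * c)

# Doubly stochastic factor S.
S = [[sqrt(a * d) / sigma, sqrt(b * c) / sigma],
     [sqrt(b * c) / sigma, sqrt(a * d) / sigma]]

# One admissible choice of the diagonal factors (main diagonals only).
D = [sqrt(a * b), sqrt(c * d)]
E = [sigma / sqrt(b * d), sigma / sqrt(a * c)]

# Entry (i, j) of the product D S E is D[i] * S[i][j] * E[j].
product = [[D[i] * S[i][j] * E[j] for j in range(2)] for i in range(2)]

# Scaling mean via (2.7): gm(D) gm(E) / n with n = 2.
sm_from_decomposition = 0.5 * sqrt(D[0] * D[1]) * sqrt(E[0] * E[1])
sm_closed_form = (sqrt(a * d) + sqrt(b * c)) / 2.0

print(product)  # reproduces [[a, b], [c, d]] up to rounding
print(sm_from_decomposition, sm_closed_form)
```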

Remark 2.12. Relations (2.4) and (2.6), once rewritten using the first equality from
(2.7), were stated explicitly and proved by London [Lo].

2.4. Existence of Sinkhorn decompositions. We need some definitions:

• Two matrices of the same size have the same zero pattern if their zero
entries occupy the same positions.

• A diagonal of a square matrix is a sequence of entries containing exactly
one entry from each row and one entry from each column; a diagonal is
called positive if each of its elements is positive.

• Two matrices $A, B \in \mathbb{R}^{m \times n}$ are permutationally equivalent if there exist
permutation matrices $P \in \mathbb{R}^{m \times m}$, $Q \in \mathbb{R}^{n \times n}$ such that $B = PAQ$.

• A matrix $A \in \mathbb{R}^{n \times n}$ is called fully indecomposable if either $n = 1$ and $A \ne 0$
or $n \ge 2$ and $A$ is not permutationally equivalent to a matrix of the form
$\begin{pmatrix} B & R \\ 0 & C \end{pmatrix}$, where $B$ and $C$ are square matrices.

• The direct sum of two square matrices $A$, $B$ is the matrix $\begin{pmatrix} A & 0 \\ 0 & B \end{pmatrix}$; this is an
associative (but noncommutative) operation.

Theorem 2.13 (Perfect-Mirsky [ ]). Let $A \in \mathbb{R}_+^{n \times n}$. The following assertions
are equivalent:

(a) $A$ has the zero pattern of some doubly stochastic matrix;

(b) every positive entry of $A$ lies on a positive diagonal;

(c) $A$ is permutationally equivalent to a direct sum of fully indecomposable
matrices.

Moreover, if $n \ge 2$ then the assertions above are equivalent to:

(d) $A$ is not permutationally equivalent to a matrix of the form $\begin{pmatrix} B & R \\ 0 & C \end{pmatrix}$, where
the matrices $B$ and $C$ are square and $R$ is nonzero.

Definition 2.14. Let $\mathcal{P}_n$ be the subset of $\mathbb{R}_+^{n \times n}$ formed by matrices that satisfy
any of the equivalent conditions in Theorem 2.13.

The following theorem provides still another equivalent definition for the set $\mathcal{P}_n$:

Theorem 2.15 (General Sinkhorn decompositions). A matrix $A \in \mathbb{R}_+^{n \times n}$ has a
Sinkhorn decomposition $A = DSE$ if and only if $A \in \mathcal{P}_n$, and in that case the
doubly stochastic matrix $S$ is unique. Moreover, the map $A \mapsto S$ is continuous.

The first part of the theorem was proved independently and by means of
different arguments by several authors [BPS, SK, MO]. Continuity was proved by
Sinkhorn [Si2] (see also [Me2]). Since these early papers, many other proofs,
generalizations, and numerical studies have appeared in the literature.

Some of the proofs of existence of Sinkhorn decompositions are closely related to 
the scaling mean (and indeed motivated our Definition 2.2), so let us sketch them. 

In order to prove that matrices in $\mathcal{P}_n$ have Sinkhorn decompositions, it is
sufficient to consider fully indecomposable matrices $A$, because then the general case
follows by characterization (c) in Theorem 2.13. In that case, Marshall and Olkin [MO]
have shown that the function after the inf in (2.3) diverges to infinity as $(x, y)$
approaches the boundary of the cone $\mathbb{R}_{++}^n \times \mathbb{R}_{++}^n$. In particular, the infimum is attained
at some $(x, y)$ in the interior of the cone. Then a Lagrange multipliers calculation
shows that all row sums and all column sums of $\operatorname{diag}(x) \, A \, \operatorname{diag}(y)$ are equal to
$x^\top A y / n$, that is, the matrix

\[ S := \frac{n}{x^\top A y} \operatorname{diag}(x) \, A \, \operatorname{diag}(y) \]

is doubly stochastic. Letting $D := \frac{x^\top A y}{n} \operatorname{diag}(x)^{-1}$ and $E := \operatorname{diag}(y)^{-1}$, we see that
$A = DSE$ is the sought-after Sinkhorn decomposition. Notice that this
strategy is basically a converse of Proposition 2.10. Djokovic [Dj] and London [Lo]
independently discovered an analogous proof based instead on the minimization
problem (2.4). For a related discussion, see [Ba2, § 2].

Let us introduce a technical device that will be useful later. Given $A \in \mathbb{R}_+^{n \times n}$,
let $\Pi(A)$ denote the matrix obtained by keeping all entries that lie on positive
diagonals, and setting the remaining entries to zero. The map $\Pi$ is a projection,
i.e. $\Pi \circ \Pi = \Pi$, and although it is discontinuous it has useful properties:

Proposition 2.16. If $A \in \mathbb{R}_+^{n \times n}$ then:

(a) $\Pi(A) \in \mathcal{P}_n \cup \{0\}$;

(b) $\Pi(A) = 0$ iff $\operatorname{per} A = 0$;

(c) $\operatorname{pm} \Pi(A) = \operatorname{pm}(A)$;

(d) $\operatorname{sm} \Pi(A) = \operatorname{sm}(A)$.

Proof. Property (a) follows from characterization (b) in Theorem 2.13,
properties (b) and (c) follow from the definition of the permanent, and property (d) follows
from characterization (d) in Theorem 2.13, the row-/column-wise symmetry
of the scaling mean, and Proposition 2.5. □

2.5. Comparison between the two means. We have the following inequalities
between the two means we have defined:

Theorem 2.17 (Generalized van der Waerden bounds). For all $A \in \mathbb{R}_+^{n \times n}$,

\[ \operatorname{sm}(A) \le \operatorname{pm}(A) \le n (n!)^{-1/n} \operatorname{sm}(A). \tag{2.8} \]

Equality holds in the first inequality iff $A$ has permanent 0 or rank 1. Equality
holds in the second inequality iff $A$ has permanent 0 or has the zero pattern of a
permutation matrix.

The sequence $(n (n!)^{-1/n})$ is increasing, and by Stirling’s formula it converges
to $e$; in particular:

\[ \operatorname{sm}(A) \le \operatorname{pm}(A) < e \operatorname{sm}(A) \quad \text{for all nonnegative square matrices } A. \tag{2.9} \]
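
As a sanity check of the bounds (2.8), one can compare both means on a small positive matrix and on a rank 1 matrix, where the first inequality should become an equality. This is a hedged sketch: the permanent is computed by brute force, the scaling mean by a Sinkhorn iteration that is assumed to converge because the inputs are positive, and all names and sample matrices are ours.

```python
from itertools import permutations
from math import factorial, isclose, prod


def permanental_mean(a):
    """pm(A) = (per(A) / n!)^(1/n), with the permanent computed by brute force."""
    n = len(a)
    per = sum(prod(a[i][s[i]] for i in range(n)) for s in permutations(range(n)))
    return (per / factorial(n)) ** (1.0 / n)


def scaling_mean(a, iterations=500):
    """sm(A) for a positive square matrix, via Sinkhorn iteration and formula (2.7)."""
    n = len(a)
    r = [1.0] * n
    c = [1.0] * n
    for _ in range(iterations):
        r = [1.0 / sum(a[i][j] * c[j] for j in range(n)) for i in range(n)]
        c = [1.0 / sum(r[i] * a[i][j] for i in range(n)) for j in range(n)]
    gm_r = prod(r) ** (1.0 / n)
    gm_c = prod(c) ** (1.0 / n)
    return 1.0 / (n * gm_r * gm_c)


A = [[2.0, 1.0, 3.0],
     [1.0, 4.0, 1.0],
     [5.0, 1.0, 2.0]]
n = len(A)
upper_constant = n * factorial(n) ** (-1.0 / n)  # n (n!)^(-1/n), below e

sm_A, pm_A = scaling_mean(A), permanental_mean(A)
print(sm_A, pm_A, upper_constant * sm_A)  # expected ordering: sm <= pm <= constant * sm

# Rank 1 example: entries u_i * v_j; here sm and pm should coincide.
u, v = [1.0, 2.0, 4.0], [3.0, 1.0, 0.5]
rank_one = [[ui * vj for vj in v] for ui in u]
print(scaling_mean(rank_one), permanental_mean(rank_one))
```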

Concerning the cases of equality in Theorem 2.17, we recall that by the
Frobenius-König theorem [MM, Mi], a matrix in $\mathbb{R}_+^{n \times n}$ has zero permanent if and only if it
contains a zero submatrix of size $r \times s$ with $r + s = n + 1$.

Notice that if $A \in \Omega_n$ then, by Proposition 2.4, inequalities (2.8) are
equivalent to (2.1); in particular, the first inequality in (2.8) is a generalization of the
van der Waerden bound to general nonnegative matrices. Actually, (2.8) is a
corollary of (2.1), as we proceed to show.

Proof of Theorem 2.17. First consider $A \in \mathcal{P}_n$. By Theorem 2.15, $A$ has a Sinkhorn
decomposition $DSE$. The matrix $S$ obeys the bounds (2.1) or equivalently (2.2),
and so using Proposition 2.10 we obtain (2.8). The first inequality in (2.8) is an
equality iff $S = J_n$, i.e. $A = D J_n E$, that is, iff $A$ is a positive matrix of rank 1. The second
inequality is an equality iff $S$ is a permutation matrix $P$, i.e. $A = DPE$, that
is, iff $A$ has the zero pattern of a permutation matrix.

Since $\mathcal{P}_n$ is dense in $\mathbb{R}_+^{n \times n}$, by continuity (Proposition 2.3) we conclude that
inequalities (2.8) hold for every $A$.

The stated conditions for equality are clearly sufficient, so let us check necessity.
Consider $A$ such that $\operatorname{sm}(A) = \operatorname{pm}(A)$. By Proposition 2.16, $\operatorname{sm} \Pi(A) = \operatorname{pm} \Pi(A)$
and moreover there are two possibilities: either $\Pi(A) = 0$ or $\Pi(A) \in \mathcal{P}_n$. In the
first case, $A$ has permanent 0. In the second case, as we have seen above, $\Pi(A)$ is
a positive matrix of rank 1; in particular $A = \Pi(A)$ has rank 1. This proves the
characterization of the first equality. The second one is dealt with analogously. □

Let us review a few other results of the literature from our perspective. The
paper [LSW] basically shows that the scaling mean can be computed efficiently,
and therefore by (2.9) the permanental mean can be efficiently computed up to a
factor of $e$, or equivalently, the permanent of a nonnegative $n \times n$ matrix can be
efficiently computed up to a factor of $e^n$. Using instead the Bethe approximation
for the permanent, it is possible to improve this factor to $2^n$: see [GS]. We will see
next further advantages of the scaling mean for the approximation of permanents.


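2.6. The scaling mean as a limit of permanental means. Recall that the
Kronecker product of two arbitrary matrices is defined as:

\[
A \otimes B := \begin{pmatrix} a_{11} B & \cdots & a_{1n} B \\ \vdots & & \vdots \\ a_{m1} B & \cdots & a_{mn} B \end{pmatrix} \in \mathbb{R}^{mr \times ns}
\quad \text{if } A = (a_{ij}) \in \mathbb{R}^{m \times n}, \ B \in \mathbb{R}^{r \times s} .
\]

It satisfies the following mixed-product property (see [MM, p. 9]):

\[ (A \otimes B)(C \otimes D) = (AC) \otimes (BD). \tag{2.10} \]

Notice that the map $\Pi$ considered in Proposition 2.16 has the following additional
property:

\[ \Pi(A \otimes B) = \Pi(A) \otimes B \quad \text{if } B \text{ is a positive square matrix.} \tag{2.11} \]

The scaling mean is well behaved with respect to the Kronecker product:

Proposition 2.18. If $A$, $B$ are nonnegative square matrices then

\[ \operatorname{sm}(A \otimes B) = \operatorname{sm}(A) \operatorname{sm}(B). \]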


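Proof. It is possible to prove the proposition directly from the definition of scaling
mean, but let us give an alternative argument. By continuity, it is sufficient to
consider matrices $A \in \mathbb{R}_+^{n \times n}$, $B \in \mathbb{R}_+^{m \times m}$ that have Sinkhorn decompositions, say
$A = DSE$, $B = D'S'E'$. Then $A \otimes B$ also has a Sinkhorn decomposition, namely,

\[ A \otimes B = (D \otimes D')(S \otimes S')(E \otimes E'), \]

so it follows from Proposition 2.10 that

\[
\operatorname{sm}(A \otimes B)
= \frac{\operatorname{gm}(D \otimes D') \operatorname{gm}(E \otimes E')}{nm}
= \frac{\operatorname{gm}(D) \operatorname{gm}(E)}{n} \cdot \frac{\operatorname{gm}(D') \operatorname{gm}(E')}{m}
= \operatorname{sm}(A) \operatorname{sm}(B).
\]

□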
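
The multiplicativity in Proposition 2.18 is easy to test numerically. The sketch below is only an illustration under stated assumptions: it reuses a Sinkhorn iteration (assumed to converge for the positive matrices chosen here) to evaluate the scaling mean via (2.7), and the helper names `kron` and `scaling_mean` as well as the sample matrices are our own.

```python
from math import isclose, prod, sqrt


def kron(a, b):
    """Kronecker product of two square matrices given as lists of rows."""
    n, m = len(a), len(b)
    size = n * m
    return [[a[i // m][j // m] * b[i % m][j % m] for j in range(size)]
            for i in range(size)]


def scaling_mean(a, iterations=500):
    """sm(A) for a positive square matrix, via Sinkhorn iteration and formula (2.7)."""
    n = len(a)
    r = [1.0] * n
    c = [1.0] * n
    for _ in range(iterations):
        r = [1.0 / sum(a[i][j] * c[j] for j in range(n)) for i in range(n)]
        c = [1.0 / sum(r[i] * a[i][j] for i in range(n)) for j in range(n)]
    return 1.0 / (n * prod(r) ** (1.0 / n) * prod(c) ** (1.0 / n))


A = [[1.0, 2.0], [3.0, 4.0]]
B = [[2.0, 1.0], [1.0, 5.0]]

sm_product = scaling_mean(kron(A, B))
sm_factors = scaling_mean(A) * scaling_mean(B)
print(sm_product, sm_factors)  # the two numbers agree up to rounding

# Cross-check against the closed 2 x 2 formula (sqrt(ad) + sqrt(bc)) / 2.
print(scaling_mean(A), (sqrt(1.0 * 4.0) + sqrt(2.0 * 3.0)) / 2.0)
```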


The property expressed by Proposition 2.18 fails for permanental means:
consider identity matrices, for example. On the other hand, Brualdi [Br] formulated a
conjecture that can be restated in terms of permanental means as follows: If $A$, $B$
are nonnegative square matrices then

\[ \operatorname{pm}(A \otimes B) \le \operatorname{pm}(A) \operatorname{pm}(B). \tag{2.12} \]







Based on experimental evidence, we also conjecture that equality holds iff $A$ or $B$
has permanent 0 or both $A$ and $B$ have rank 1. Unfortunately Brualdi’s conjecture,
formulated around 50 years ago, apparently has not received much attention.

We now describe an important relation between the permanental and the scaling
means. Prior to the proof of the van der Waerden conjecture, Friedland [ ] proved
that the permanent of each $S \in \Omega_n$ is at least $e^{-n}$, which by the Stirling formula
differs from the van der Waerden lower bound by a subexponential factor. The
crux of his proof is to obtain the following limit formula:

\[ \lim_{m \to \infty} \bigl( \operatorname{per}(S \otimes J_m) \bigr)^{1/m} = e^{-n} \quad \text{if } S \in \Omega_n , \tag{2.13} \]

where $J_m$ is the $m \times m$ matrix all of whose entries are $1/m$. We call this fact
the Friedland limit; it appears in [ ] as formula (1.6) and, as explained in § 2 of
that paper, follows relatively easily from the then unproved van der Waerden bound.
Basically as a corollary of Friedland’s limit, we will obtain the following:

Theorem 2.19 (Generalized Friedland limit). For any nonnegative square matrix
$A$ we have

\[ \operatorname{sm}(A) = \lim_{m \to \infty} \operatorname{pm}(A \otimes U_m) , \tag{2.14} \]

where $U_m$ is the $m \times m$ matrix all of whose entries are 1.
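
To get a feeling for the limit (2.14), one can compute $\operatorname{pm}(A \otimes U_m)$ for a 2 x 2 matrix $A$ and a few small values of $m$. The following sketch is an illustration only: it evaluates permanents with a standard dynamic program over column subsets (our choice, practical up to moderate sizes), uses the closed formula of Example 2.11 for the scaling mean, and all helper names are hypothetical.

```python
from math import factorial, sqrt


def permanent_dp(a):
    """Permanent by dynamic programming over subsets of columns."""
    n = len(a)
    full = (1 << n) - 1
    dp = [0.0] * (1 << n)  # dp[mask]: rows 0..k-1 matched to the k columns in mask
    dp[0] = 1.0
    for mask in range(full):
        row = bin(mask).count("1")  # index of the next row to be matched
        for j in range(n):
            if not mask & (1 << j):
                dp[mask | (1 << j)] += dp[mask] * a[row][j]
    return dp[full]


def permanental_mean(a):
    """pm(A) = (per(A) / n!)^(1/n)."""
    n = len(a)
    return (permanent_dp(a) / factorial(n)) ** (1.0 / n)


def kron_with_ones(a, m):
    """A (x) U_m, where U_m is the m x m matrix with all entries equal to 1."""
    size = len(a) * m
    return [[a[i // m][j // m] for j in range(size)] for i in range(size)]


def scaling_mean_2x2(a):
    """Closed formula of Example 2.11: (sqrt(ad) + sqrt(bc)) / 2."""
    (p, q), (r, s) = a
    return (sqrt(p * s) + sqrt(q * r)) / 2.0


identity = [[1.0, 0.0], [0.0, 1.0]]
A = [[4.0, 1.0], [1.0, 9.0]]

values_identity = [permanental_mean(kron_with_ones(identity, m)) for m in range(1, 6)]
values_A = [permanental_mean(kron_with_ones(A, m)) for m in range(1, 6)]

print(scaling_mean_2x2(identity), values_identity)  # values approach 0.5 from above
print(scaling_mean_2x2(A), values_A)                # values stay above sm(A) = 3.5
```

The convergence is slow, so small values of $m$ only show the trend: the permanental means move from $\operatorname{pm}(A)$ toward $\operatorname{sm}(A)$ while staying above it, as Theorem 2.17 requires.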

Remark 2.20. The generalized van der Waerden bound (i.e., the first inequality in
Theorem 2.17) becomes a corollary of Theorem 2.19 if Brualdi’s conjecture (2.12) is
assumed.


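Proof of Theorem 2.19. Consider $S \in \Omega_n$. By Stirling’s formula, $(m!)^{1/m} = e^{-1} m \theta_m$,
where $\theta_m \to 1$, so

\[
\operatorname{pm}(S \otimes U_m)
= m \operatorname{pm}(S \otimes J_m)
= m \left( \frac{\operatorname{per}(S \otimes J_m)}{(nm)!} \right)^{1/(nm)}
= \frac{\bigl( \operatorname{per}(S \otimes J_m) \bigr)^{1/(nm)}}{e^{-1} n \theta_{nm}} ,
\]

which by Friedland’s limit (2.13) converges to $1/n$. Since $\operatorname{sm}(S) = 1/n$, we conclude
that (2.14) holds for doubly stochastic matrices. Next consider a matrix $A$ in the set
$\mathcal{P}_n$ (recall Definition 2.14), and let $DSE$ be its Sinkhorn decomposition. Using the
mixed-product property (2.10) we factorize $A \otimes U_m$ as $(D \otimes I_m)(S \otimes U_m)(E \otimes I_m)$,
where the matrices $D \otimes I_m$ and $E \otimes I_m$ are diagonal. Therefore

\[
\operatorname{pm}(A \otimes U_m)
= \underbrace{\operatorname{gm}(D \otimes I_m)}_{= \operatorname{gm}(D)} \
\underbrace{\operatorname{pm}(S \otimes U_m)}_{\to 1/n} \
\underbrace{\operatorname{gm}(E \otimes I_m)}_{= \operatorname{gm}(E)}
\to \operatorname{sm}(A),
\]

establishing (2.14) in this case. Finally, consider a general $A \in \mathbb{R}_+^{n \times n}$. Using
Proposition 2.16 and property (2.11) we obtain

\[ \operatorname{pm}(A \otimes U_m) = \operatorname{pm}(\Pi(A \otimes U_m)) = \operatorname{pm}(\Pi(A) \otimes U_m) \to \operatorname{sm}(\Pi(A)) = \operatorname{sm}(A). \]

□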


As mentioned at the end of § 2.5, the scaling mean is easier to
compute than the permanental mean, and in view of the bounds of Theorem 2.17,
the former can be used to approximate the latter up to a multiplicative error of at
most $e$. Theorem 2.19 indicates that for large matrices of the form $A \otimes U_m$, this
factor is actually close to 1.

As we will see later, some classes of random matrices are approximately
permutationally equivalent to matrices of the form $A \otimes U_m$: this is the idea behind the
Law of Large Permanents. While in this paper we are not concerned with numerical
studies, it may be interesting to investigate further the effectiveness of the scaling
mean for the computation of permanents of reasonably well-behaved matrices.

Remark 2.21. In addition to (2.3), (2.4), (2.7), and (2.14), other characterizations
of the scaling mean are [FLS, Thrm. 5.3], [GS, formula (2)] and the following one:

\[ \operatorname{sm}(A) = \frac{1}{n} \inf_{\Lambda} \frac{\rho(\Lambda A)}{\operatorname{gm}(\Lambda)} , \tag{2.15} \]

where $\Lambda$ runs over the diagonal matrices with positive main diagonal. We only
sketch the proof: The $\le$ inequality follows from (2.6) and row-wise homogeneity of
the scaling mean. In the case that $A \in \mathcal{P}_n$, i.e. $A$ has a Sinkhorn decomposition
$DSE$, the infimum on the RHS is attained at $\Lambda = E^{-1} D^{-1}$. Recall that the LHS
of (2.15) depends continuously on $A$, while the RHS is upper semicontinuous and
monotonically increasing with respect to matrix entries. (The latter fact follows
from monotonicity of $\rho$, itself a consequence of Gelfand’s spectral radius formula [Bh,
p. 204].) This permits us to extend the equality from the dense set $\mathcal{P}_n$ to the whole
$\mathbb{R}_+^{n \times n}$.


3. Scaling mean and Sinkhorn decomposition of functions 

In this section we extend the notions of scaling mean and Sinkhorn decomposition
from matrices to arbitrary nonnegative measurable functions defined on probability
spaces with respect to a pair of sub-σ-algebras. The allowable scalings are those that
are measurable with respect to one of these sub-σ-algebras, which thus supersede
the row/column structure.

Moreover we prove the existence of this functional Sinkhorn decomposition under
a boundedness assumption (Theorem 3.6).

Finally, in § 3.5 we analyze the particular setting of direct products, which is
the one that we will use later for our Law of Large Permanents (Section 4) and for
its applications (Section 5). Let us remark that though this particular setting is
sufficient for the main results of this paper, the general setting where scalings are
controlled by sub-σ-algebras is in our opinion more transparent and, as discussed
in Section 6, should be indispensable for stronger Laws.

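3.1. The functional scaling mean. Let us fix a probability space $(\Omega, \mathcal{A}, \mathbb{P})$. Let
$\mathcal{G}(\mathbb{P})$ denote the set of positive measurable functions $\varphi \colon \Omega \to \mathbb{R}_{++}$ such that
$\log \varphi \in L^1(\mathbb{P})$. The geometric mean of a function $\varphi \in \mathcal{G}(\mathbb{P})$ is defined as:

\[ \operatorname{gm}(\varphi) := \exp \left( \int \log \varphi \, d\mathbb{P} \right) . \tag{3.1} \]

The AM-GM inequality says that $\operatorname{gm}(\varphi) \le \int \varphi \, d\mathbb{P}$.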

Let us also fix a pair of sub-cr-algebras Mi, A 2 <= A. For each i e {1, 2}, we define 
two sets of functions Gi 3 Bi as follows: 

Qi := {</ 5 : il —» P ++ : <p is Mi-measurable and logy; e L 1 (P)} , (3.2) 

Bi := {(£>: —> P ++ : ip is Mi-measurable and logt^ e L°°(P)} . (3.3) 

Our scaling functions will always take values in these spaces. 

Definition 3.1. Let $f\colon \Omega \to \mathbb{R}_+$ be a nonnegative measurable function. The scaling
mean of $f$ with respect to the sub-σ-algebras $\mathcal{A}_1, \mathcal{A}_2$ is defined as:

\[ \mathrm{sm}_{\mathcal{A}_1,\mathcal{A}_2}(f) := \inf_{g_1 \in \mathcal{G}_1,\; g_2 \in \mathcal{G}_2} \frac{1}{\mathrm{gm}(g_1)\,\mathrm{gm}(g_2)} \int g_1 f g_2 \, d\mathbb{P} . \tag{3.4} \]

When no confusion is likely to arise, we write $\mathrm{sm}(f) = \mathrm{sm}_{\mathcal{A}_1,\mathcal{A}_2}(f)$.

Concrete examples will be presented later in § 3.5. 

We list some basic properties of the scaling mean: 

• Monotonicity: If $f \le g$ almost everywhere then $\mathrm{sm}(f) \le \mathrm{sm}(g)$.

• Reflexivity: If $f$ equals a constant $c$ almost everywhere then $\mathrm{sm}(f) = c$;
this is a consequence of the AM-GM inequality.

• Homogeneity: If $g_1 \in \mathcal{G}_1$ and $g_2 \in \mathcal{G}_2$ then

\[ \mathrm{sm}(g_1 f g_2) = \mathrm{gm}(g_1) \, \mathrm{sm}(f) \, \mathrm{gm}(g_2) . \tag{3.5} \]

The following proposition says that in order to evaluate the infimum in formula
(3.4) it is sufficient to consider $g_i$ in the smaller space $\mathcal{B}_i$ defined by (3.3):


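Proposition 3.2. For any measurable $f\colon \Omega \to \mathbb{R}_+$ we have:

\[ \mathrm{sm}(f) = \inf_{g_1 \in \mathcal{B}_1,\; g_2 \in \mathcal{B}_2} \frac{1}{\mathrm{gm}(g_1)\,\mathrm{gm}(g_2)} \int g_1 f g_2 \, d\mathbb{P} . \]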


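Proof. Fix $f \ge 0$ and consider $g_1 \in \mathcal{G}_1$, $g_2 \in \mathcal{G}_2$ such that $\int g_1 f g_2 \, d\mathbb{P} < \infty$. Define
two sequences $(g_{1,k})$, $(g_{2,k})$ respectively in $\mathcal{B}_1$, $\mathcal{B}_2$ as follows:

\[ g_{i,k}(\omega) := \begin{cases} g_i(\omega) & \text{if } |\log g_i(\omega)| \le k, \\ 1 & \text{otherwise.} \end{cases} \]

Applying the dominated convergence theorem three times, we conclude that:

\[ \frac{1}{\mathrm{gm}(g_1)\,\mathrm{gm}(g_2)} \int g_1 f g_2 \, d\mathbb{P} = \lim_{k \to \infty} \frac{1}{\mathrm{gm}(g_{1,k})\,\mathrm{gm}(g_{2,k})} \int g_{1,k} f g_{2,k} \, d\mathbb{P} . \]

The proposition follows. □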


3.2. Conditional expectations and doubly stochastic functions. We begin 
by recalling some basic facts about conditional expectations, referring the reader 
to [Bo] for more details. 

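Let $(\Omega, \mathcal{A}, \mathbb{P})$ be a probability space. If $f \in L^1(\mathbb{P})$ and $\mathcal{A}_1$ is a sub-σ-algebra of
$\mathcal{A}$, let $\mathbb{E}(f|\mathcal{A}_1)$ denote the conditional expectation of $f$ with respect to $\mathcal{A}_1$, i.e. the
a.e.-unique $\mathcal{A}_1$-measurable function such that

\[ \int g f \, d\mathbb{P} = \int g \, \mathbb{E}(f|\mathcal{A}_1) \, d\mathbb{P} \quad \text{for every bounded $\mathcal{A}_1$-measurable function $g$.} \]

For example, if $\{B_1, \dots, B_k\}$ is a partition of $\Omega$ into finitely many sets of positive
probability and $\mathcal{A}_1$ is the σ-algebra generated by this partition, then $\mathbb{E}(f|\mathcal{A}_1)$ is the
simple function that takes the value $\frac{1}{\mathbb{P}(B_i)} \int_{B_i} f \, d\mathbb{P}$ on the set $B_i$.

Note that if $f \in L^1(\mathbb{P})$ then, as an immediate consequence of the definition of
conditional expectation,

\[ \mathbb{E}(g f|\mathcal{A}_1) = g \, \mathbb{E}(f|\mathcal{A}_1) \quad \text{for every bounded $\mathcal{A}_1$-measurable function $g$.} \tag{3.6} \]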


Again fix a probability space $(\Omega, \mathcal{A}, \mathbb{P})$ and a pair of sub-σ-algebras $\mathcal{A}_1, \mathcal{A}_2 \subseteq \mathcal{A}$.
Let us say that an integrable nonnegative function $f\colon \Omega \to \mathbb{R}_+$ is doubly stochastic if

\[ \mathbb{E}(f|\mathcal{A}_1) = \mathbb{E}(f|\mathcal{A}_2) = 1 \quad \mathbb{P}\text{-a.e.} \tag{3.7} \]

Notice that a doubly stochastic function has arithmetic mean $\int f \, d\mathbb{P} = 1$. Analogously
for the scaling mean:

Proposition 3.3. If $f$ is doubly stochastic then $\mathrm{sm}(f) = 1$.

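Proof. If $f$ is doubly stochastic then the measure $d\nu := f \, d\mathbb{P}$ is a probability. Therefore,
for any $g_1 \in \mathcal{B}_1$ and $g_2 \in \mathcal{B}_2$, using the AM-GM inequality and property (3.6)
we obtain:

\begin{align*}
\int g_1 f g_2 \, d\mathbb{P} &= \int g_1 g_2 \, d\nu \\
&\ge \exp\left( \int \log(g_1 g_2) \, d\nu \right) \\
&= \exp\left( \int f \log g_1 \, d\mathbb{P} \right) \exp\left( \int f \log g_2 \, d\mathbb{P} \right) \\
&= \exp\left( \int \mathbb{E}(f \log g_1|\mathcal{A}_1) \, d\mathbb{P} \right) \exp\left( \int \mathbb{E}(f \log g_2|\mathcal{A}_2) \, d\mathbb{P} \right) \\
&= \exp\left( \int (\log g_1) \, \mathbb{E}(f|\mathcal{A}_1) \, d\mathbb{P} \right) \exp\left( \int (\log g_2) \, \mathbb{E}(f|\mathcal{A}_2) \, d\mathbb{P} \right) \\
&= \mathrm{gm}(g_1) \, \mathrm{gm}(g_2) .
\end{align*}

So it follows from Proposition 3.2 that $\mathrm{sm}(f) \ge 1$. Considering $g_1 = g_2$ identically
equal to 1 we conclude that $\mathrm{sm}(f) = 1$. □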

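3.3. Functional Sinkhorn decompositions. Fix a probability space $(\Omega, \mathcal{A}, \mathbb{P})$
and a pair of sub-σ-algebras $\mathcal{A}_1, \mathcal{A}_2 \subseteq \mathcal{A}$.

Definition 3.4. A Sinkhorn decomposition of a function $f\colon \Omega \to \mathbb{R}_+$ is a factorization
of the form

\[ f(\omega) = \varphi(\omega) \, g(\omega) \, \psi(\omega) \]

where $g$ is doubly stochastic, $\varphi \in \mathcal{G}_1$, and $\psi \in \mathcal{G}_2$.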

Again we postpone the examples to § 3.5. 

Let us relate the scaling mean and Sinkhorn decompositions: 

Proposition 3.5. If $f$ has a Sinkhorn decomposition $\varphi g \psi$ then

(a) $\mathrm{sm}(f) = \mathrm{gm}(\varphi) \, \mathrm{gm}(\psi)$.

(b) The infimum in (3.4) is attained at the functions $g_1 = 1/\varphi$, $g_2 = 1/\psi$.

Proof. Part (a) is an immediate consequence of Proposition 3.3 and the homogeneity
property (3.5). Part (b) follows from part (a). □

Our next result gives a sufficient condition for the existence of Sinkhorn
decompositions. We denote by $\mathcal{B}(\mathbb{P})$ the space of positive functions $f$ such that
$\log f \in L^\infty(\mathbb{P})$.

Theorem 3.6. Every $f \in \mathcal{B}(\mathbb{P})$ has a Sinkhorn decomposition $\varphi g \psi$ where $\varphi \in \mathcal{B}_1$
and $\psi \in \mathcal{B}_2$. Moreover, if $f = \varphi' g' \psi'$ is another Sinkhorn decomposition such that
$\varphi' \in \mathcal{B}_1$ and $\psi' \in \mathcal{B}_2$ then there exists $\lambda \in \mathbb{R}_{++}$ such that $\varphi' = \lambda \varphi$, $g' = g$, and
$\psi' = \lambda^{-1} \psi$ $\mathbb{P}$-almost everywhere.

The proof of this theorem is independent of the rest of the paper and is presented
in the next subsection. The theorem itself (in the particular case of direct
products: see § 3.5) will be used in the proof of our Law of Large Permanents
(Theorem 4.1) in Section 4.
3.4. Proof of Theorem 3.6. The proof has two steps. First, we reduce the problem
to the existence of a fixed point for a certain nonlinear operator. Then we
show that this operator contracts Hilbert's projective metric, and so it must have
a fixed point. These ideas come from the literature. The fact that the existence
of Sinkhorn decompositions for matrices is equivalent to a fixed point problem was
first noted by Menon [Me]. The usefulness of Hilbert's projective metric in this
context was noted independently in [FL] (for matrices) and [Nu1] (for matrices and
for functions). Unfortunately this elegant strategy uses strict positivity in an essential
way, and in order to study more general functions other methods are needed:
see [Nu2, BLN].


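Fix a probability space $(\Omega, \mathcal{A}, \mathbb{P})$, a pair of sub-σ-algebras $\mathcal{A}_1, \mathcal{A}_2 \subseteq \mathcal{A}$, and a
function $f \in \mathcal{B}(\mathbb{P})$. Let $\mathcal{B}_1, \mathcal{B}_2 \subseteq \mathcal{B}(\mathbb{P})$ be as in (3.3). Define four maps forming a
(non-commutative) diagram:

\[ \begin{array}{ccc}
\mathcal{B}_1 & \xrightarrow{\;I_1\;} & \mathcal{B}_1 \\
{\scriptstyle K_1} \uparrow & & \downarrow {\scriptstyle K_2} \\
\mathcal{B}_2 & \xleftarrow{\;I_2\;} & \mathcal{B}_2
\end{array} \]

by the formulas:

\[ (I_i(\varphi))(\omega) := \frac{1}{\varphi(\omega)} , \qquad (K_i(\varphi))(\omega) := \mathbb{E}(\varphi f|\mathcal{A}_i)(\omega) . \]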


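Let $T := K_1 \circ I_2 \circ K_2 \circ I_1$, i.e., the map obtained by going around the diagram. Notice
that $T$ maps rays (i.e. sets of the form $\mathbb{R}_{++}\varphi$) into rays. The following observation
says that fixed rays yield Sinkhorn decompositions:


Lemma 3.7. Suppose $\varphi \in \mathcal{B}_1$ and $c \in \mathbb{R}_{++}$ are such that $T(\varphi) = c\varphi$ $\mathbb{P}$-a.e. Then
$c = 1$ and moreover, letting $\psi := K_2 \circ I_1(\varphi)$ and $g := f/(\varphi\psi)$, the factorization $\varphi g \psi$
is a Sinkhorn decomposition of $f$. Conversely, if $\varphi g \psi$ is a Sinkhorn decomposition
of $f$ with $\varphi \in \mathcal{B}_1$ and $\psi \in \mathcal{B}_2$ then $T(\varphi) = \varphi$ $\mathbb{P}$-a.e.

Proof. Suppose $T(\varphi) = c\varphi$ $\mathbb{P}$-a.e., and let $\psi$ and $g$ be as above. Then the following
equalities hold $\mathbb{P}$-a.e.:

\begin{align*}
\mathbb{E}(g|\mathcal{A}_2) &= \mathbb{E}\left( \tfrac{f}{\varphi\psi} \,\middle|\, \mathcal{A}_2 \right) && \text{(by definition of $g$)} \\
&= \tfrac{1}{\psi} \, \mathbb{E}\left( \tfrac{f}{\varphi} \,\middle|\, \mathcal{A}_2 \right) && \text{(by property (3.6) and the fact that $\psi \in \mathcal{B}_2$)} \\
&= 1 && \text{(by definition of $\psi$).}
\end{align*}

Similarly, the relation $K_1 \circ I_2(\psi) = c\varphi$ implies that

\[ \mathbb{E}(g|\mathcal{A}_1) = c \quad \mathbb{P}\text{-a.e.} \]

Integrating with respect to $\mathbb{P}$ yields that $c = 1$, and therefore $g$ is doubly stochastic,
as we wanted to show. The converse part of the lemma is immediate. □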


Hence to complete the proof of Theorem 3.6 we are left to show that T has a 
fixed ray. We will use a classical geometric-analytical device. 

Given two functions $\varphi, \tilde\varphi \in \mathcal{B}_1$, we define their Hilbert distance as:

\[ d(\varphi, \tilde\varphi) := \log \frac{\operatorname{ess\,sup}(\varphi/\tilde\varphi)}{\operatorname{ess\,inf}(\varphi/\tilde\varphi)} . \tag{3.8} \]

This is a pseudometric such that $d(\varphi, \tilde\varphi) = 0$ iff $\tilde\varphi = c\varphi$ $\mathbb{P}$-a.e. for some constant
$c \in \mathbb{R}_{++}$. In other words, if elements of $\mathcal{B}_1$ that coincide a.e. are considered equal
then $d$ induces a genuine metric on the quotient space $\mathcal{B}_1/\mathbb{R}_{++}$. Moreover, this
metric is complete. Analogously, we define a pseudometric on $\mathcal{B}_2$, which we also
denote by $d$. Definition (3.8) is only a particular case of Hilbert's projective metric
on convex cones; see e.g. [Vul] or [Li, Section 1]. In these references the reader will
find an important property due to Garrett Birkhoff [Bi], which specialized to our
case is stated as follows:


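Proposition 3.8. If $L\colon \mathcal{B}_1 \to \mathcal{B}_2$ is linear and its image has finite diameter $\delta$ then
the following contraction property holds:

\[ d(L(\varphi), L(\tilde\varphi)) \le \tanh\left( \frac{\delta}{4} \right) d(\varphi, \tilde\varphi) \quad \text{for all } \varphi, \tilde\varphi \in \mathcal{B}_1 . \]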
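The Hilbert distance (3.8) and the contraction property of Proposition 3.8 can be illustrated numerically. The following is a minimal sketch, assuming a finite space so that functions are positive vectors and a positive linear map is a matrix with positive entries. The function names are hypothetical, and the closed formula used for the diameter of the image of a positive matrix is the standard one for Birkhoff's theorem, not something stated in this paper.

```python
import math

def hilbert_distance(u, v):
    """Hilbert projective distance between two positive vectors, as in (3.8)."""
    ratios = [a / b for a, b in zip(u, v)]
    return math.log(max(ratios) / min(ratios))

def apply(L, u):
    """Matrix-vector product for a matrix given as a list of rows."""
    return [sum(l * x for l, x in zip(row, u)) for row in L]

def image_diameter(L):
    """Hilbert diameter of the image of the positive cone under a positive matrix
    (assumed formula: the largest logarithm of a 2 x 2 cross-ratio of entries)."""
    m, n = len(L), len(L[0])
    return max(
        math.log(L[i][k] * L[j][l] / (L[j][k] * L[i][l]))
        for i in range(m) for j in range(m)
        for k in range(n) for l in range(n)
    )

# Hypothetical data: one positive matrix and two positive vectors.
L = [[1.0, 2.0, 0.5], [3.0, 1.0, 1.0], [0.5, 0.5, 2.0]]
u = [1.0, 5.0, 0.2]
v = [2.0, 0.3, 1.0]

delta = image_diameter(L)
before = hilbert_distance(u, v)
after = hilbert_distance(apply(L, u), apply(L, v))
print(after, "<=", math.tanh(delta / 4) * before)
```

Since the distance only sees rays, `hilbert_distance(u, [3 * a for a in u])` vanishes up to rounding, in agreement with the remark that $d$ is a pseudometric on functions and a metric on rays.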


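Proof of Theorem 3.6. The linear map $K_2(\varphi) := \mathbb{E}(\varphi f|\mathcal{A}_2)$ defined before satisfies
the inequalities

\[ (\operatorname{ess\,inf} f) \, \mathbb{E}(\varphi|\mathcal{A}_2) \le K_2(\varphi) \le (\operatorname{ess\,sup} f) \, \mathbb{E}(\varphi|\mathcal{A}_2) \quad \mathbb{P}\text{-a.e.} \]

Analogous inequalities for $\tilde\varphi$ yield

\[ \left( \frac{\operatorname{ess\,inf} f}{\operatorname{ess\,sup} f} \right) \frac{\mathbb{E}(\varphi|\mathcal{A}_2)}{\mathbb{E}(\tilde\varphi|\mathcal{A}_2)} \le \frac{K_2(\varphi)}{K_2(\tilde\varphi)} \le \left( \frac{\operatorname{ess\,sup} f}{\operatorname{ess\,inf} f} \right) \frac{\mathbb{E}(\varphi|\mathcal{A}_2)}{\mathbb{E}(\tilde\varphi|\mathcal{A}_2)} , \]

which by (3.8) imply that the image of $K_2$ has diameter

\[ \delta \le 2 \log \left( \frac{\operatorname{ess\,sup} f}{\operatorname{ess\,inf} f} \right) < \infty . \]

So by Proposition 3.8 the map $K_2$ contracts Hilbert distances uniformly. Analogously
for $K_1$. On the other hand, the nonlinear maps $I_1$ and $I_2$ preserve Hilbert
distances. In particular $T = K_1 \circ I_2 \circ K_2 \circ I_1$ contracts Hilbert distances uniformly.
By completeness we conclude that $T$ has a unique fixed ray, so Lemma 3.7 allows
us to conclude. □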


As a byproduct of the proof above, the iterates under T of any initial ray converge 
with a known exponential rate to a fixed ray. This gives an effective algorithm for 
the computation of a Sinkhorn decomposition of / and of its scaling mean, in 
particular. 
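The iteration just described can be sketched in code. The block below is a minimal illustration, assuming the finite direct-product setting with uniform probabilities on both factors, so that conditional expectations become row or column averages. The function names and the fixed iteration count are hypothetical choices, not part of the paper.

```python
import math

def gm(values):
    """Geometric mean with respect to the uniform probability."""
    return math.exp(sum(math.log(v) for v in values) / len(values))

def sinkhorn(F, iterations=200):
    """Iterate T = K1 o I2 o K2 o I1 for a positive m x n array F (uniform weights)."""
    m, n = len(F), len(F[0])
    phi = [1.0] * m
    for _ in range(iterations):
        # psi = K2(I1(phi)): average over x of f(x, y) / phi(x)
        psi = [sum(F[x][y] / phi[x] for x in range(m)) / m for y in range(n)]
        # phi = K1(I2(psi)): average over y of f(x, y) / psi(y)
        phi = [sum(F[x][y] / psi[y] for y in range(n)) / n for x in range(m)]
    psi = [sum(F[x][y] / phi[x] for x in range(m)) / m for y in range(n)]
    g = [[F[x][y] / (phi[x] * psi[y]) for y in range(n)] for x in range(m)]
    return phi, g, psi

# Hypothetical positive function on a 2-point space times a 3-point space.
F = [[1.0, 2.0, 3.0], [4.0, 1.0, 0.5]]
phi, g, psi = sinkhorn(F)

# By Proposition 3.5(a) the scaling mean is gm(phi) * gm(psi).
scaling_mean = gm(phi) * gm(psi)
row_means = [sum(row) / len(row) for row in g]
col_means = [sum(g[x][y] for x in range(len(g))) / len(g) for y in range(len(g[0]))]
print(scaling_mean, row_means, col_means)
```

In this normalization a doubly stochastic function has all row and column averages equal to 1, which is what `row_means` and `col_means` report once the iteration has converged.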

3.5. The main example: direct products. Here we describe the particular form 
of the previous results that will be used in Sections 4 and 5. 

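Let $(X, \mathcal{X}, \mu)$, $(Y, \mathcal{Y}, \nu)$ be two probability spaces. Take $(\Omega, \mathcal{A}, \mathbb{P})$ as the product
space $(X \times Y, \mathcal{X} \times \mathcal{Y}, \mu \times \nu)$, and take the two sub-σ-algebras $\mathcal{A}_1 = \mathcal{X} \times \{\emptyset, Y\}$,
$\mathcal{A}_2 = \{\emptyset, X\} \times \mathcal{Y}$.

An $\mathcal{A}_1$-measurable function is just a function that only depends on the $x$ coordinate,
and analogously for $\mathcal{A}_2$-measurable functions. We denote by $\mathcal{G}(\mu)$, respectively
$\mathcal{B}(\mu)$, the sets of positive measurable functions $\varphi\colon X \to \mathbb{R}_{++}$ such that $\log\varphi$ is
$\mu$-integrable, respectively, $\mu$-essentially bounded. Analogously we define sets $\mathcal{G}(\nu)$
and $\mathcal{B}(\nu)$. Recalling notations (3.2), (3.3), with a small abuse of language we can
write

\[ \mathcal{G}_1 = \mathcal{G}(\mu), \quad \mathcal{B}_1 = \mathcal{B}(\mu), \quad \mathcal{G}_2 = \mathcal{G}(\nu), \quad \mathcal{B}_2 = \mathcal{B}(\nu) . \]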

In this context, the previously introduced concepts can be recast as follows: 
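• The scaling mean of a measurable function $f\colon X \times Y \to \mathbb{R}_+$ is:

\[ \mathrm{sm}(f) = \inf_{\varphi \in \mathcal{G}(\mu),\; \psi \in \mathcal{G}(\nu)} \frac{1}{\mathrm{gm}(\varphi)\,\mathrm{gm}(\psi)} \iint \varphi(x) f(x,y) \psi(y) \, d\mu(x) \, d\nu(y) . \tag{3.9} \]

• The conditional expectations of a measurable function $f\colon X \times Y \to \mathbb{R}$ are:

\[ \mathbb{E}(f|\mathcal{A}_1) = \int f(\cdot, y) \, d\nu(y) \quad \text{and} \quad \mathbb{E}(f|\mathcal{A}_2) = \int f(x, \cdot) \, d\mu(x) . \]

• A function $f\colon X \times Y \to \mathbb{R}_+$ is doubly stochastic if:

\[ \int f(\cdot, y) \, d\mu = \int f(x, \cdot) \, d\nu = 1 \quad \text{for $\mu$-a.e. $x$ and $\nu$-a.e. $y$.} \tag{3.10} \]

• A Sinkhorn decomposition of a function $f\colon X \times Y \to \mathbb{R}_+$ is a factorization
of the form:

\[ f(x, y) = \varphi(x) \, g(x, y) \, \psi(y) \tag{3.11} \]

where $g$ is doubly stochastic, $\varphi \in \mathcal{G}(\mu)$, and $\psi \in \mathcal{G}(\nu)$.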

In the case that $X$ and $Y$ are finite sets and the probabilities $\mu$ and $\nu$ are
equidistributed, we reobtain the concepts studied in Section 2; notice however that
the definition of doubly stochastic matrices uses a different normalization.

Let us remark that functional Sinkhorn decompositions as in (3.11) were first
obtained by Knopp and Sinkhorn [KS] assuming that $X$ and $Y$ are compact and
that $f$ is continuous and strictly positive. For other existence results under various
hypotheses, see [Nu2, BLN].


Let us see a concrete non-trivial situation where functional Sinkhorn decompositions
and scaling means can be computed directly. The result below will be used
in Section 5.


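Proposition 3.9. Suppose that $f\colon X \times Y \to \mathbb{R}_+$ is a measurable nonnegative function
of the form

\[ f(x, y) = \begin{cases} f_0(x) & \text{if } y \in Y_0, \\ f_1(x) & \text{if } y \in Y_1, \end{cases} \]

where $\{Y_0, Y_1\}$ is a partition of $Y$ into sets of positive measure, and $\log(f_0 + f_1)$ is
$\mu$-integrable. Then

\[ \mathrm{sm}(f) = c^c \left( \frac{1-c}{r} \right)^{1-c} \exp\left( \int \log(f_0 + r f_1) \, d\mu \right) , \tag{3.12} \]

where $c := \nu(Y_0)$ and $r$ is the unique positive root of the equation

\[ \int \frac{f_0}{f_0 + r f_1} \, d\mu = c . \tag{3.13} \]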


Proof. By dominated convergence, the LHS of (3.13) is continuous and strictly decreasing
with respect to $r$, and so by the intermediate value theorem the equation has a
unique solution $r \in (0, \infty)$. Let $\varphi := f_0 + r f_1$. Note that $|\log\varphi - \log(f_0 + f_1)| \le
|\log r|$, and therefore $\varphi \in \mathcal{G}(\mu)$. Let

\[ \psi(y) := \begin{cases} c & \text{if } y \in Y_0, \\ (1-c)/r & \text{if } y \in Y_1. \end{cases} \]

Direct calculation shows that the function $g := f/(\varphi\psi)$ is doubly stochastic, so $\varphi g \psi$
is a Sinkhorn decomposition of $f$. Proposition 3.5(a) then yields (3.12). □
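Formulas (3.12) and (3.13) are easy to evaluate numerically. Here is a minimal sketch, assuming that $X$ is a finite set with given weights (so that integrals over $\mu$ are weighted sums) and that $f_0, f_1$ are strictly positive. The function names, the bisection bracket, and the sample data are hypothetical.

```python
import math

def solve_r(f0, f1, weights, c, lo=1e-12, hi=1e12, steps=200):
    """Bisection for the root r of sum_x w(x) f0/(f0 + r f1) = c, i.e. (3.13)."""
    def lhs(r):
        return sum(w * a / (a + r * b) for w, a, b in zip(weights, f0, f1))
    for _ in range(steps):
        mid = math.sqrt(lo * hi)      # bisect on a logarithmic scale
        if lhs(mid) > c:              # lhs is decreasing in r
            lo = mid
        else:
            hi = mid
    return math.sqrt(lo * hi)

def scaling_mean_two_blocks(f0, f1, weights, c):
    """Evaluate formula (3.12) on a finite space X with the given weights."""
    r = solve_r(f0, f1, weights, c)
    log_gm_phi = sum(w * math.log(a + r * b) for w, a, b in zip(weights, f0, f1))
    return r, c ** c * ((1 - c) / r) ** (1 - c) * math.exp(log_gm_phi)

# Hypothetical data: X has two equally likely points, f0 = (1/2, 2), f1 = 1.
f0 = [0.5, 2.0]
f1 = [1.0, 1.0]
weights = [0.5, 0.5]
r, sm = scaling_mean_two_blocks(f0, f1, weights, c=0.5)
print(r, sm)
```

For these data equation (3.13) reads $\tfrac12 \left( \tfrac{1/2}{1/2 + r} + \tfrac{2}{2 + r} \right) = \tfrac12$, whose positive root is $r = 1$, so (3.12) gives $\mathrm{sm}(f) = \tfrac12 \sqrt{9/2}$.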
4. A Law of Large Permanents


4.1. Statement and comments. In this section we prove our main result (Theorem 4.1),
which was already stated in the Introduction. Let us recall the statement
for the reader's convenience, and also fix some notation.

Assume that $(X, \mathcal{X}, \mu)$, $(Y, \mathcal{Y}, \nu)$ are Lebesgue probability spaces, and $T\colon X \to X$,
$S\colon Y \to Y$ are measure preserving transformations. Given a function $f\colon X \times Y \to \mathbb{R}$,
for each $(x, y) \in X \times Y$ and each positive integer $n$ we define the following
$n \times n$ matrix:

\[ \square_n f(x,y) := \begin{pmatrix}
f(x, y) & f(Tx, y) & \cdots & f(T^{n-1}x, y) \\
f(x, Sy) & f(Tx, Sy) & \cdots & f(T^{n-1}x, Sy) \\
\vdots & \vdots & & \vdots \\
f(x, S^{n-1}y) & f(Tx, S^{n-1}y) & \cdots & f(T^{n-1}x, S^{n-1}y)
\end{pmatrix} , \tag{4.1} \]

which can be thought of as the truncation of an infinite matrix $\square f(x, y) \in \mathbb{R}^{\mathbb{N} \times \mathbb{N}}$.

Recall that $\mathcal{B}(\mu \times \nu)$ denotes the set of positive measurable functions on $X \times Y$
essentially bounded away from zero and infinity. Such functions have scaling means,
given by formula (3.9). We can now restate our main result as follows:


Theorem 4.1. If $T$ and $S$ are ergodic, and $f \in \mathcal{B}(\mu \times \nu)$ then

\[ \lim_{n \to \infty} \mathrm{pm}\left( \square_n f(x, y) \right) = \mathrm{sm}(f) \tag{4.2} \]

for $\mu \times \nu$-almost every $(x, y) \in X \times Y$.


This Law of Large Permanents is a very general ergodic theorem. In Section 5 
we will see how to apply it to other natural types of means, and in Section 6 we 
will discuss the possibility of even more general laws. We stress that Theorem 4.1 
not only states the existence of the limit in (4.2), but characterizes it as a scaling 
mean, which can be used to efficiently compute its value. 
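To make the objects in Theorem 4.1 concrete, the following sketch builds the matrix (4.1) and evaluates its permanental mean, taken here as the $n$-th root of $\operatorname{per}(A)/n!$. It is an illustration under stated assumptions only: the two irrational rotations, the product function $f$, the starting point and the helper names are all hypothetical, and Ryser's formula limits it to small $n$.

```python
import math
from itertools import combinations

def permanent(A):
    """Permanent of a square matrix via Ryser's formula (exponential time)."""
    n = len(A)
    total = 0.0
    for size in range(1, n + 1):
        for cols in combinations(range(n), size):
            prod = 1.0
            for row in A:
                prod *= sum(row[j] for j in cols)
            total += (-1) ** (n - size) * prod
    return total

def pm(A):
    """Permanental mean: n-th root of per(A) / n!."""
    n = len(A)
    return (permanent(A) / math.factorial(n)) ** (1.0 / n)

def box(f, T, S, x, y, n):
    """The n x n matrix (4.1): entry (i, j) is f(T^j x, S^i y)."""
    xs, ys = [x], [y]
    for _ in range(n - 1):
        xs.append(T(xs[-1]))
        ys.append(S(ys[-1]))
    return [[f(a, b) for a in xs] for b in ys]

# Hypothetical example: two irrational rotations of the circle [0, 1).
T = lambda x: (x + math.sqrt(2)) % 1.0
S = lambda y: (y + math.sqrt(3)) % 1.0
f = lambda x, y: (1.0 + x) * (2.0 - y)   # a product function phi(x) * psi(y)

n = 8
A = box(f, T, S, 0.1, 0.2, n)
value = pm(A)
print(value)
```

A product function is chosen because the check is exact at every finite $n$: by homogeneity, `value` equals the geometric mean of $1 + x$ along the $T$-orbit times the geometric mean of $2 - y$ along the $S$-orbit. For this $f$ the scaling mean is $\mathrm{gm}(\varphi)\,\mathrm{gm}(\psi) = (4/e)^2$, the limit predicted by (4.2).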

It is worthwhile to note that Theorem 4.1 implies the generalized Friedland limit
(Theorem 2.19), at least for positive matrices. Indeed, given a positive $k \times k$ matrix $A$, take $T$ and
$S$ as cyclic permutations on sets $X$ and $Y$ of cardinality $k$, and let $\mu$ and $\nu$ be the
corresponding invariant measures. Let $f$ be such that $\square_k f(x, y) = A$ for some point
$(x, y)$. Then, for every $m \ge 1$, the matrix $\square_{km} f(x, y)$ is permutationally equivalent
to the Kronecker product $A \otimes U_m$, where $U_m$ is the $m \times m$ matrix all of whose entries
are 1. In particular, $\mathrm{pm}(A \otimes U_m)$ equals $\mathrm{pm}(\square_{km} f(x, y))$, which by Theorem 4.1
converges to $\mathrm{sm}(f)$ as $m \to \infty$. Using for instance a Sinkhorn decomposition of the
matrix $A$ and the related Sinkhorn decomposition of the function $f$, one checks
that $\mathrm{sm}(f) = \mathrm{sm}(A)$, thus obtaining the generalized Friedland limit (2.14).

Actually, we will reason in the converse direction and deduce Theorem 4.1 from 
Theorem 2.19. Let us sketch the proof. The first step is to approximate / by a 
suitable simple function, and to show that the permanental means do not change 
much; the values of the simple function are recorded on a square matrix A. The 
second step is to show that the matrix (4.1) is, modulo a permutation of rows
and columns, approximately equal to a Kronecker product $A \otimes U_m$, and then to
use Theorem 2.19 to relate the permanental means of these matrices with scaling
means. It is technically convenient to work with doubly stochastic functions, so we 
will also use Theorem 3.6 on the existence of Sinkhorn decompositions. 

The precise proof of Theorem 4.1 will take up the rest of this section, which
is organized as follows: In § 4.2 we recall some basic results: approximation by
conditional expectations, and an ergodic theorem. In § 4.3 we obtain some estimates
on how the permanent changes under perturbations of the matrix. These facts are
used in § 4.4 to implement the strategy sketched above and to prove the theorem.

4.2. Preliminaries from Measure Theory and Ergodic Theory. We have
defined conditional expectations in § 3.2. The following result is contained in [Bo,
Theorems 10.2.2, 10.2.3] and describes the behavior of conditional expectations as
the σ-algebra is refined:

Proposition 4.2. Let $(\Omega, \mathcal{A}, \mu)$ be a probability space and let $f \in L^1(\mu)$. Suppose
that $(\mathcal{A}_k)$ is an increasing sequence of sub-σ-algebras of $\mathcal{A}$ whose union generates
the σ-algebra $\mathcal{A}$ modulo sets of measure zero. Then the functions $\mathbb{E}(f|\mathcal{A}_k)$ converge
almost surely and in $L^1$ to $f$ as $k \to \infty$.

Let us describe the basic ergodic theorem that we will need. Consider probability
spaces $(X, \mu)$, $(Y, \nu)$ and measure preserving transformations $T\colon X \to X$, $S\colon Y \to Y$.
Then we can define a measure preserving action of the semigroup $\mathbb{N}^2$ (where
$0 \in \mathbb{N}$) on the product space $(X \times Y, \mu \times \nu)$ as follows:

\[ (i,j) \cdot (x, y) := (T^i x, S^j y), \quad \text{where } (i,j) \in \mathbb{N}^2 \text{ and } (x,y) \in X \times Y. \]

Notice that this action is ergodic if and only if both $T$ and $S$ are ergodic. In
this case, by the ergodic theorem for $\mathbb{N}^2$-actions (see [Ke, Theorem 2.1.5] or [Kr,
Chapter 6, Theorem 3.5]), for any $h \in L^1(\mu \times \nu)$ we have

\[ \lim_{n \to \infty} \frac{1}{n^2} \sum_{i,j=0}^{n-1} h(T^i x, S^j y) = \int h \, d(\mu \times \nu) \quad \text{for $\mu \times \nu$-a.e. $(x,y)$.} \tag{4.3} \]

4.3. More preliminaries: Regularity estimates for the permanent of strictly
positive matrices. Let us begin by recalling some basic facts. As for the determinant,
the permanent of a square matrix $A = (a_{ij}) \in \mathbb{R}^{n \times n}$ can be computed by
means of a Laplace expansion along any column $j$:

\[ \operatorname{per} A = \sum_{i=1}^{n} a_{ij} \operatorname{per} A(i|j), \]

where, as usual, $A(i|j)$ denotes the matrix obtained from $A$ by deleting the $i$-th
row and the $j$-th column. Similar Laplace expansions along rows hold. Using either
kind of expansion, we see that the partial derivatives of the permanent function are
simply:

\[ \frac{\partial \operatorname{per} A}{\partial a_{ij}} = \operatorname{per} A(i|j). \tag{4.4} \]

Given $\lambda > 1$, let us say that a positive matrix $A = (a_{ij}) \in \mathbb{R}^{n \times n}$ is $\lambda$-bounded if:

\[ \lambda^{-1} \le a_{ij} \le \lambda \quad \text{for each } i, j. \tag{4.5} \]
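The Laplace expansion along a column and the derivative formula (4.4) can be checked numerically on a small matrix. The sketch below is only an illustration: the brute-force permanent, the helper names and the sample matrix are hypothetical, and indices are 0-based in the code.

```python
from itertools import permutations

def permanent(A):
    """Permanent straight from the definition (sum over all permutations)."""
    n = len(A)
    total = 0.0
    for sigma in permutations(range(n)):
        prod = 1.0
        for i in range(n):
            prod *= A[i][sigma[i]]
        total += prod
    return total

def minor(A, i, j):
    """The matrix A(i|j): delete row i and column j."""
    return [[a for c, a in enumerate(row) if c != j]
            for r, row in enumerate(A) if r != i]

def laplace_along_column(A, j):
    """Right-hand side of the Laplace expansion along column j."""
    return sum(A[i][j] * permanent(minor(A, i, j)) for i in range(len(A)))

A = [[1.0, 2.0, 0.5], [0.7, 1.5, 1.2], [2.0, 0.9, 1.1]]

# Finite-difference approximation of the partial derivative in (4.4).
i, j, h = 1, 2, 1e-6
A_shift = [row[:] for row in A]
A_shift[i][j] += h
derivative = (permanent(A_shift) - permanent(A)) / h
print(derivative, permanent(minor(A, i, j)))
```

Because the permanent is an affine function of each single entry, the finite difference agrees with $\operatorname{per} A(i|j)$ up to rounding, for any step `h`.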

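Lemma 4.3. Let $A \in \mathbb{R}^{n \times n}$ be a $\lambda$-bounded matrix and let $s$ be the sum of the
entries of $A$. Then for any $i, j \in \{1, \dots, n\}$ we have

\[ \frac{\lambda^{-5}}{n} \le \frac{\lambda^{-4} n}{s} \le \frac{\operatorname{per} A(i|j)}{\operatorname{per} A} \le \frac{\lambda^{4} n}{s} \le \frac{\lambda^{5}}{n} . \]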
Proof. The outer inequalities being trivial, we only need to care about the inner
ones. Summing Laplace expansions of the permanent of $A = (a_{ij})$ along the $n$
columns, we obtain $n \operatorname{per} A = \sum_{i,j} a_{ij} \operatorname{per} A(i|j)$, and in particular,

\[ s \min_{i,j} \operatorname{per} A(i|j) \le n \operatorname{per} A \le s \max_{i,j} \operatorname{per} A(i|j). \]

Suppose that the minimum and the maximum above are attained at $(i_1, j_1)$ and
$(i_2, j_2)$, respectively. Consider the two Laplace expansions:

\[ \operatorname{per} A(i_2|j_2) = \sum_{k \neq i_2} a_{k j_1} \operatorname{per} A(i_2 k|j_1 j_2), \]

\[ \operatorname{per} A(i_2|j_1) = \sum_{k \neq i_2} a_{k j_2} \operatorname{per} A(i_2 k|j_1 j_2). \]

Since $a_{k j_1} \le \lambda^2 a_{k j_2}$, it follows that $\operatorname{per} A(i_2|j_2) \le \lambda^2 \operatorname{per} A(i_2|j_1)$. An analogous
argument gives $\operatorname{per} A(i_2|j_1) \le \lambda^2 \operatorname{per} A(i_1|j_1)$, thus proving that $\operatorname{per} A(i_2|j_2) \le
\lambda^4 \operatorname{per} A(i_1|j_1)$. So for any $i, j$ we have

\[ \operatorname{per} A(i|j) \le \operatorname{per} A(i_2|j_2) \le \lambda^4 \operatorname{per} A(i_1|j_1) \le \lambda^4 n s^{-1} \operatorname{per} A, \]

as claimed. The remaining inequality is proved analogously. □


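We can now prove a regularity estimate for the permanent:

Lemma 4.4. If $A, B \in \mathbb{R}^{n \times n}$ are $\lambda$-bounded matrices then

\[ \left| \log \frac{\operatorname{per} B}{\operatorname{per} A} \right| \le \frac{\lambda^4 n \sum_{i,j} |a_{ij} - b_{ij}|}{\min\left\{ \sum_{i,j} a_{ij}, \sum_{i,j} b_{ij} \right\}} \le \frac{\lambda^5}{n} \sum_{i,j} |a_{ij} - b_{ij}| . \]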

Proof. The second inequality being trivial, we only need to care about the first
one. For all $t \in [0,1]$, the convex combination $A_t := (1 - t)A + tB$ is also a positive
$\lambda$-bounded matrix. To prove the first inequality, we will apply the mean value
theorem to the function $f(t) := \log \operatorname{per} A_t$ and use the estimate

\[ \left| \log \frac{\operatorname{per} B}{\operatorname{per} A} \right| = |f(1) - f(0)| \le \max_{t \in [0,1]} |f'(t)| . \tag{4.6} \]

Using formula (4.4), we compute:

\[ f'(t) = \frac{1}{\operatorname{per} A_t} \sum_{i,j} (b_{ij} - a_{ij}) \operatorname{per} A_t(i|j), \]

while by Lemma 4.3 we have

\[ \frac{\operatorname{per} A_t(i|j)}{\operatorname{per} A_t} \le \frac{\lambda^4 n}{\sum_{i,j} \left( (1 - t) a_{ij} + t b_{ij} \right)} . \]

Plugging these estimates into (4.6) we obtain the desired inequality. □


As another consequence of Lemma 4.3, we obtain the following estimate on how
the permanental mean of a matrix varies when a row and a column are deleted:


Lemma 4.5. There is a constant $C > 1$ such that for any $\lambda > 1$ and any $n \ge 2$,
if $A \in \mathbb{R}^{n \times n}$ is a $\lambda$-bounded matrix then for any $i, j \in \{1, \dots, n\}$ we have

\[ \left| \log \frac{\mathrm{pm}(A(i|j))}{\mathrm{pm}(A)} \right| \le \frac{6 \log \lambda + C \log n}{n} . \]
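Proof. By Stirling's formula, $\log(n!) = n \log n - n + O(\log n)$, and so

\[ \log \mathrm{pm}(A) = \frac{1}{n} \log \operatorname{per} A - \log n + 1 + O\left( \frac{\log n}{n} \right) . \]

Analogously, letting $B := A(i|j)$,

\[ \log \mathrm{pm}(B) = \frac{1}{n-1} \log \operatorname{per} B - \log n + 1 + O\left( \frac{\log n}{n} \right) . \]

These two estimates yield:

\[ \left| \log \frac{\mathrm{pm}(B)}{\mathrm{pm}(A)} \right|
= \left| \frac{1}{n} \log \mathrm{pm}(B) + \frac{n-1}{n} \log \mathrm{pm}(B) - \log \mathrm{pm}(A) \right|
\le \underbrace{\frac{1}{n} \left| \log \mathrm{pm}(B) \right|}_{(*)}
+ \underbrace{\frac{1}{n} \left| \log \frac{\operatorname{per} B}{\operatorname{per} A} \right|}_{(**)}
+ O\left( \frac{\log n}{n} \right) . \]

Since $B$ is $\lambda$-bounded, it follows from internality of the permanental mean that $(*) \le
(\log \lambda)/n$. On the other hand, by Lemma 4.3 we have $(**) \le (5 \log \lambda + \log n)/n$.
The lemma follows. □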

4.4. Proof of Theorem 4.1. 




4.4.0. Zeroth step: Some reductions. In order to prove Theorem 4.1, it is sufficient
to consider functions that, in addition to being in $\mathcal{B}(\mu \times \nu)$, are doubly stochastic,
i.e., satisfy relations (3.10). Indeed, assuming the theorem already proved in this
case, and given an arbitrary $f \in \mathcal{B}(\mu \times \nu)$, we consider the Sinkhorn decomposition
$f(x,y) = \varphi(x) g(x,y) \psi(y)$ as in (3.11) given by Theorem 3.6. Then, by row-wise
and column-wise homogeneity of the permanental mean,

\[ \mathrm{pm}\left( \square_n f(x, y) \right) = \left( \prod_{i=0}^{n-1} \varphi(T^i x) \right)^{1/n} \left( \prod_{j=0}^{n-1} \psi(S^j y) \right)^{1/n} \mathrm{pm}\left( \square_n g(x, y) \right) . \]

Since $T$ is ergodic, by Birkhoff's theorem the first factor on the RHS converges to
the geometric mean (3.1) for $\mu$-a.e. $x \in X$. Analogously for the second factor. Since
we are assuming Theorem 4.1 already proved in the doubly stochastic case, the third
factor converges to the scaling mean of $g$, which by Proposition 3.3 equals 1. In
conclusion, $\mathrm{pm}(\square_n f)$ converges a.e. to $\mathrm{gm}(\varphi) \, \mathrm{gm}(\psi)$, which by Proposition 3.5(a)
equals $\mathrm{sm}(f)$.

So from this point on we assume that

\[ f \in \mathcal{B}(\mu \times \nu) \text{ is doubly stochastic,} \tag{4.7} \]

and our aim is to show that the permanental means of the matrices (4.1) converge
a.e. to 1.

As a second reduction, it is sufficient to prove Theorem 4.1 assuming that the
measures $\mu$ and $\nu$ are non-atomic. Indeed, fix an arbitrary non-atomic Lebesgue
probability space $(Z, \theta)$ and an ergodic measure preserving transformation $U\colon Z \to
Z$, and consider the product spaces $(\hat X, \hat\mu) := (X \times Z, \mu \times \theta)$ and $(\hat Y, \hat\nu) := (Y \times
Z, \nu \times \theta)$. Then the transformations $\hat T(x,z) := (Tx, Uz)$ and $\hat S(y,w) := (Sy, Uw)$
preserve the measures $\hat\mu$ and $\hat\nu$, respectively, and are ergodic. Given a doubly
stochastic function $f \in \mathcal{B}(\mu \times \nu)$, we consider the doubly stochastic function
$\hat f(x,z,y,w) := f(x,y)$ on $\hat X \times \hat Y$. Since the measures $\hat\mu$ and $\hat\nu$ are non-atomic,
and assuming that Theorem 4.1 is already proved in this case, we conclude that
the permanental mean of the matrix $\square_n \hat f(x,z,y,w)$ converges to 1 at $\hat\mu \times \hat\nu$-a.e.
$(x,z,y,w)$. But the latter matrix obviously equals $\square_n f(x, y)$, so we obtain (4.2).

So we can assume that the Lebesgue spaces $(X, \mu)$ and $(Y, \nu)$ are non-atomic.
For convenience in the following proof, we will actually assume that

\[ X = Y = [0,1] \text{ and } \mu = \nu \text{ is Lebesgue measure.} \tag{4.8} \]


4.4.1. First step: Discretizing f. Fix an arbitrary $\varepsilon > 0$. Let us construct a convenient
discretized approximation of the given function $f$.

Let $k$ be a large positive integer (to be specified later), and define a positive
matrix $A = (a_{pq}) \in \mathbb{R}_+^{k \times k}$ by

\[ a_{pq} := k^2 \int_{(p-1)/k}^{p/k} \int_{(q-1)/k}^{q/k} f(x,y) \, dy \, dx. \]

As a consequence of (4.7), the matrix $k^{-1} A$ is doubly stochastic. Indeed, for every
$p \in \{1, \dots, k\}$ the sum of the corresponding row of $A$ is

\[ \sum_{q=1}^{k} a_{pq} = k^2 \int_{(p-1)/k}^{p/k} \int_0^1 f(x, y) \, dy \, dx = k, \]

and in an analogous way we compute the column sums. In particular, by the
homogeneity of the scaling mean and Proposition 2.4, we have $\mathrm{sm}(A) = 1$.

Let $g$ be a function on $X \times Y = [0,1]^2$ equal to the constant $a_{pq}$ on each sub-square
$[(p-1)/k, p/k) \times [(q-1)/k, q/k)$, where $p, q \in \{1, \dots, k\}$. Notice that $g$ is
nothing but the conditional expectation of $f$ with respect to the σ-algebra generated
by the partition into these sub-squares. Therefore, by Proposition 4.2, if $k$ is chosen
large enough then $f$ and $g$ are $L^1$-close in the sense that

\[ \int |f - g| \, d(\mu \times \nu) < \varepsilon. \tag{4.9} \]

We fix the integer $k$ and therefore the matrix $A$ and the function $g$ from now on.
Since $f \in \mathcal{B}(\mu \times \nu)$, there exists $\lambda > 1$ such that

\[ \lambda^{-1} \le f \le \lambda \quad \mu \times \nu\text{-a.e.} \]


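The values of $g$ are obtained by averaging the values of $f$ and therefore satisfy the
same bounds, that is, $\lambda^{-1} \le g \le \lambda$. In particular, the matrices $\square_n f(x, y)$ and
$\square_n g(x, y)$ are $\lambda$-bounded for a.e. $(x,y)$. We now use Lemma 4.4 to compare their
permanental means:

\[ \left| \log \frac{\mathrm{pm}(\square_n f(x,y))}{\mathrm{pm}(\square_n g(x,y))} \right|
= \frac{1}{n} \left| \log \frac{\operatorname{per}(\square_n f(x,y))}{\operatorname{per}(\square_n g(x,y))} \right|
\le \frac{\lambda^5}{n^2} \sum_{i,j=0}^{n-1} \left| f(T^i x, S^j y) - g(T^i x, S^j y) \right| . \]

Using the ergodic theorem (4.3) and the bound (4.9), we conclude that for a.e.
$(x,y)$,

\[ \limsup_{n \to \infty} \left| \log \frac{\mathrm{pm}(\square_n f(x,y))}{\mathrm{pm}(\square_n g(x,y))} \right| \le \lambda^5 \varepsilon. \tag{4.10} \]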
4.4.2. Second step: Comparing $\square_n g$ with a Kronecker product. Let us fix $(x, y)$ such
that (4.10) holds, the $T$-orbit of $x$ visits each of the intervals $[0, 1/k]$, $[1/k, 2/k]$,
..., $[(k-1)/k, 1]$ with limit frequency $1/k$, and analogously for the $S$-orbit of $y$.

For each $n \ge 1$, since the points $x$ and $y$ are not periodic, there exist permutations
$\tau_n$ and $\sigma_n$ of the set $\{0, 1, \dots, n-1\}$ such that

\[ T^{\tau_n(0)} x < T^{\tau_n(1)} x < \cdots < T^{\tau_n(n-1)} x, \qquad S^{\sigma_n(0)} y < S^{\sigma_n(1)} y < \cdots < S^{\sigma_n(n-1)} y. \]

Let $P_{\tau_n}$ be the $n \times n$ permutation matrix whose 1's are located on the positions
$(i, \tau_n(i))$, $i \in \{0, 1, \dots, n-1\}$. Analogously define $P_{\sigma_n}$. Consider the product matrix

\[ B_n := P_{\tau_n} \cdot \square_n g(x,y) \cdot P_{\sigma_n}^{-1} = \left( g(T^{\tau_n(i)} x, S^{\sigma_n(j)} y) \right)_{i,j=0}^{n-1} . \]

This matrix has a decomposition into blocks:

\[ B_n = \begin{pmatrix}
B_{n,11} & B_{n,12} & \cdots & B_{n,1k} \\
B_{n,21} & B_{n,22} & \cdots & B_{n,2k} \\
\vdots & \vdots & & \vdots \\
B_{n,k1} & B_{n,k2} & \cdots & B_{n,kk}
\end{pmatrix} \]

where the block $B_{n,pq}$ has all its entries equal to $a_{pq}$, has width equal to the
cardinality of $\{x, Tx, \dots, T^{n-1}x\} \cap [(p-1)/k, p/k]$, and has height equal to the
cardinality of $\{y, Sy, \dots, S^{n-1}y\} \cap [(q-1)/k, q/k]$. It follows that if $n = km$ for
some sufficiently large integer $m$ then the matrix $B_n$ and the Kronecker product
$A \otimes U_m$ differ in at most $\varepsilon n$ rows and columns. Since both matrices are $\lambda$-bounded,
by Lemma 4.4 we have


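\[ \left| \log \frac{\mathrm{pm}(B_n)}{\mathrm{pm}(A \otimes U_m)} \right|
= \frac{1}{n} \left| \log \frac{\operatorname{per}(B_n)}{\operatorname{per}(A \otimes U_m)} \right|
\le \frac{1}{n} \cdot \frac{\lambda^5}{n} (\lambda - \lambda^{-1}) \varepsilon n^2 \le \lambda^6 \varepsilon. \]

Since the matrices $B_n$ and $\square_n g(x, y)$ are by definition permutationally equivalent,
they have the same permanental mean (by the row-/column-wise symmetry property).
On the other hand, by the generalized Friedland limit (Theorem 2.19), we
have

\[ \lim_{m \to \infty} \mathrm{pm}(A \otimes U_m) = \mathrm{sm}(A) = 1. \]

So we obtain

\[ \limsup_{m \to \infty} \left| \log \mathrm{pm}(\square_{km} g(x, y)) \right| \le \lambda^6 \varepsilon . \]

Notice that, as a consequence of Lemma 4.5,

\[ \lim_{n \to \infty} \log \frac{\mathrm{pm}(\square_{k \lfloor n/k \rfloor} g(x,y))}{\mathrm{pm}(\square_n g(x,y))} = 0 , \]

and so

\[ \limsup_{n \to \infty} \left| \log \mathrm{pm}(\square_n g(x, y)) \right| \le \lambda^6 \varepsilon . \]

Recalling (4.10), we obtain

\[ \limsup_{n \to \infty} \left| \log \mathrm{pm}(\square_n f(x,y)) \right| \le (\lambda^6 + \lambda^5) \varepsilon . \]

Since $\varepsilon > 0$ is arbitrary, we infer that

\[ \lim_{n \to \infty} \mathrm{pm}(\square_n f(x,y)) = 1 . \]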

This proves (4.2) under the assumptions (4.7) and (4.8). As explained in § 4.4.0, 
Theorem 4.1 in full generality follows. 
5. Applications


5.1. Symmetric means. The elementary symmetric polynomial of degree $k$ in
$n \ge k$ variables is defined as

\[ E_k(z_1, z_2, \dots, z_n) := \sum_{1 \le i_1 < i_2 < \cdots < i_k \le n} z_{i_1} z_{i_2} \cdots z_{i_k} . \]

These sums appear in a wide range of different areas of mathematics. For example,
Vieta's formula states that if $P(z) = z^n + a_{n-1} z^{n-1} + \cdots + a_1 z + a_0$ is a monic
polynomial and $z_1, \dots, z_n$ are its roots (repeated according to multiplicity) then
$a_{n-k} = (-1)^k E_k(z_1, \dots, z_n)$.

Assuming that $z = (z_1, \dots, z_n)$ is a string of nonnegative real numbers, we define
its $k$-th symmetric mean as

\[ \mathrm{sym}_k(z) := \left( \frac{E_k(z_1, \dots, z_n)}{\binom{n}{k}} \right)^{1/k} . \]

Notice that $\mathrm{sym}_1(z)$ is the arithmetic mean, and $\mathrm{sym}_n(z)$ is the geometric mean. In
the 18th century, Maclaurin discovered that $\mathrm{sym}_1(z) \ge \mathrm{sym}_2(z) \ge \cdots \ge \mathrm{sym}_n(z)$,
thus generalizing the AM-GM inequality (see [HLP, § 2.22]).

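Symmetric means have the properties of reflexivity, monotonicity, internality,
continuity, homogeneity, and, of course, symmetry. Actually, they can be expressed
in terms of permanental means: we have

\[ \mathrm{sym}_k(z) = \left[ \mathrm{pm}(R_k(z)) \right]^{n/k} , \tag{5.1} \]

where $R_k(z_1, \dots, z_n)$ is the $n \times n$ matrix whose first $k$ rows are equal to
$(z_1, \dots, z_n)$ and whose remaining $n - k$ rows have all entries equal to 1:

\[ R_k(z_1, \dots, z_n) := \begin{pmatrix}
z_1 & z_2 & \cdots & z_n \\
\vdots & \vdots & & \vdots \\
z_1 & z_2 & \cdots & z_n \\
1 & 1 & \cdots & 1 \\
\vdots & \vdots & & \vdots \\
1 & 1 & \cdots & 1
\end{pmatrix}
\begin{array}{l} \left. \vphantom{\begin{matrix} z \\ \vdots \\ z \end{matrix}} \right\} k \text{ rows} \\ \left. \vphantom{\begin{matrix} 1 \\ \vdots \\ 1 \end{matrix}} \right\} n - k \text{ rows} \end{array} \tag{5.2} \]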

Using this relation, we can deduce from the Law of Large Permanents (Theorem 4.1)
an ergodic theorem for symmetric means. Actually, such a result already
exists and was obtained in 1976 by Halasz and Szekely [HS]:


Theorem 5.1 (Halasz-Szekely). Let $(X, \mu)$ be a Lebesgue probability space, $T\colon X \to
X$ be a measure-preserving ergodic transformation, $g \in \mathcal{B}(\mu)$, and $0 < c < 1$. Suppose
$k(n)$ is a sequence of integers satisfying

\[ 1 \le k(n) \le n \quad \text{and} \quad \lim_{n \to \infty} \frac{k(n)}{n} = c. \]

Then, for $\mu$-almost every $x$, the limit

\[ \lim_{n \to \infty} \mathrm{sym}_{k(n)} \left( g(x), g(Tx), \dots, g(T^{n-1}x) \right) \]

exists and equals

\[ \mathrm{sym}_c(g) := c \left( \frac{1-c}{r} \right)^{\frac{1-c}{c}} \exp\left( \frac{1}{c} \int \log(g + r) \, d\mu \right) , \tag{5.3} \]

where $r = r(c)$ is the unique positive root of the equation

\[ \int \frac{g}{g + r} \, d\mu = c. \]

Proof. Let $Y = \{0,1\}^{\mathbb{N}}$, let $\nu$ be the Bernoulli measure with weights $c$, $1 - c$, let
$S\colon Y \to Y$ be the shift, and let $\{Y_0, Y_1\}$ be the partition of $Y$ into the cylinders of
length 1. Consider the function

\[ f(x,y) := \begin{cases} g(x) & \text{if } y \in Y_0, \\ 1 & \text{if } y \in Y_1. \end{cases} \]

For $\mu$-a.e. $x$ and $\nu$-a.e. $y$, the conclusion (4.2) of Theorem 4.1 holds, and moreover
if $\ell(n)$ denotes the cardinality of the set $\{y, Sy, \dots, S^{n-1}y\} \cap Y_0$ then $\ell(n)/n \to c$
as $n \to \infty$. Using notation (5.2), define matrices

\[ A_n := R_{\ell(n)}(g(x), \dots, g(T^{n-1}x)), \qquad B_n := R_{k(n)}(g(x), \dots, g(T^{n-1}x)). \]

Notice that $A_n$ can be obtained from $\square_n f(x,y)$ by permuting rows. In particular,

\[ \lim_{n \to \infty} \mathrm{pm}(A_n) = \lim_{n \to \infty} \mathrm{pm}(\square_n f(x, y)) = \mathrm{sm}(f). \]

Let $\lambda > 1$ be such that $\lambda^{-1} \le g \le \lambda$. Then the matrices $A_n$ and $B_n$ are $\lambda$-bounded,
and so by Lemma 4.4,

\[ \left| \log \frac{\mathrm{pm}(B_n)}{\mathrm{pm}(A_n)} \right|
= \frac{1}{n} \left| \log \frac{\operatorname{per}(B_n)}{\operatorname{per}(A_n)} \right|
\le \frac{1}{n} \cdot \frac{\lambda^5}{n} (\lambda - \lambda^{-1}) \, n \, |\ell(n) - k(n)|
\le \lambda^6 \, \frac{|\ell(n) - k(n)|}{n} , \]

which converges to 0 as $n \to \infty$. So $\mathrm{pm}(B_n) \to \mathrm{sm}(f)$, and therefore by (5.1) we
conclude that $\mathrm{sym}_{k(n)}(g(x), \dots, g(T^{n-1}x)) \to [\mathrm{sm}(f)]^{1/c}$. We conclude the proof
using Proposition 3.9 to compute $\mathrm{sm}(f)$. □
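The limit formula (5.3) can be compared with symmetric means of a long finite string. The sketch below is a hedged illustration: it assumes a function $g$ with finitely many values, takes as orbit the alternating sequence produced by the cyclic permutation of a two-point space, and uses a normalized recursion for $E_j / \binom{i}{j}$ to avoid overflow. All names and parameters are hypothetical.

```python
import math

def sym_mean(z, k):
    """k-th symmetric mean of z, computed with the normalized recursion
    p_j <- ((i - j) / i) * p_j + (j / i) * z_i * p_{j-1},
    where p_j = E_j / C(i, j) over the first i entries."""
    p = [1.0] + [0.0] * k
    for i, zi in enumerate(z, start=1):
        for j in range(min(i, k), 0, -1):
            p[j] = ((i - j) / i) * p[j] + (j / i) * zi * p[j - 1]
    return p[k] ** (1.0 / k)

def limit_formula(values, weights, c):
    """Right-hand side of (5.3) for a function g taking finitely many values."""
    lo, hi = 1e-12, 1e12
    for _ in range(200):
        r = math.sqrt(lo * hi)        # bisection on a logarithmic scale
        if sum(w * v / (v + r) for v, w in zip(values, weights)) > c:
            lo = r
        else:
            hi = r
    r = math.sqrt(lo * hi)
    integral = sum(w * math.log(v + r) for v, w in zip(values, weights))
    return c * ((1 - c) / r) ** ((1 - c) / c) * math.exp(integral / c)

values, weights, c = [0.5, 2.0], [0.5, 0.5], 0.5
n = 400
z = [values[i % 2] for i in range(n)]      # orbit g(x), g(Tx), ... on two points
finite = sym_mean(z, int(c * n))
limit = limit_formula(values, weights, c)
print(finite, limit)
```

For these data the root is $r = 1$ and (5.3) evaluates to $9/8$, which lies between the geometric mean 1 and the arithmetic mean $5/4$ of the two values, as Maclaurin's inequalities require.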


Let us compare the result above with that of Halasz and Szekely's paper [HS].
The theorem from that paper requires only a weak integrability condition, namely,
$\int \log(1 + g) \, d\mu < \infty$. The theorem is stated in terms of independent identically
distributed random variables, but the proof actually does not use independence,
and ergodicity suffices. Therefore, the actual Halasz-Szekely theorem is stronger
than Theorem 5.1 above. This indicates that a weakening of the hypotheses of
Theorem 4.1 should be pursued (more about this in Section 6 below) and should not
be regarded as an inherent drawback of our approach. Let us mention that the proof
in [HS] employs completely different tools, namely: Vieta's formula and the Cauchy
integral formula are used to relate the means with a certain complex integral, and
then the saddle point method is used to estimate the value of the integral. This
line of argument has been used in most of the probability papers on the subject.
Our methods, on the other hand, provide a more transparent explanation for the
complicated formula (5.3), and apply to much more general types of means. A
second example of application is given in the next subsection.


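5.2. Muirhead means. Let $z = (z_1, \dots, z_n)$ and $\alpha = (\alpha_1, \dots, \alpha_n)$ be two strings
of nonnegative numbers, not all the $\alpha_i$'s being 0. We then define the Muirhead
$\alpha$-mean of the $z_i$'s as

\[
\operatorname{mum}_\alpha(z) := \left( \frac{1}{n!} \sum_{\sigma \in S_n} \prod_{i=1}^{n} z_{\sigma(i)}^{\alpha_i} \right)^{\frac{1}{\alpha_1 + \cdots + \alpha_n}},
\]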
where $S_n$ denotes the set of permutations of $\{1, \dots, n\}$. Note that this coincides
with the arithmetic mean if $\alpha = (1, 0, \dots, 0)$, and with the geometric mean if
$\alpha = (1, 1, \dots, 1)$. Muirhead means also generalize the symmetric ones; indeed

\[
\operatorname{sym}_k(z) = \operatorname{mum}_{\alpha_k}(z) \quad \text{where} \quad \alpha_k := (\underbrace{1, 1, \dots, 1}_{k}, \underbrace{0, 0, \dots, 0}_{n-k}).
\]
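As a sanity check of the definition and of the special cases just mentioned, the following Python sketch evaluates the Muirhead mean by brute force over all permutations. The function name `muirhead_mean` is ours (it does not come from any library), and the enumeration of all $n!$ permutations is only practical for small $n$.

```python
from itertools import permutations
from math import factorial, prod

def muirhead_mean(alpha, z):
    """Brute-force Muirhead alpha-mean of z (illustrative helper, small n only)."""
    n = len(z)
    assert len(alpha) == n and sum(alpha) > 0
    # Arithmetic mean, over all permutations s, of prod_i z[s(i)] ** alpha[i].
    total = sum(prod(z[s[i]] ** alpha[i] for i in range(n))
                for s in permutations(range(n)))
    # Normalizing exponent 1 / (alpha_1 + ... + alpha_n).
    return (total / factorial(n)) ** (1.0 / sum(alpha))

z = [1.0, 2.0, 4.0]
print(muirhead_mean([1, 0, 0], z))  # arithmetic mean of z
print(muirhead_mean([1, 1, 1], z))  # geometric mean of z
print(muirhead_mean([1, 1, 0], z))  # symmetric mean sym_2(z)
```

With $\alpha = (1,1,0)$ the sum over permutations counts each product $z_i z_j$ ($i < j$) twice, which is exactly compensated by the factor $1/n!$, so the value agrees with the usual second symmetric mean.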


The celebrated Muirhead-Hardy-Littlewood-Pólya inequality [HLP, § 2.18] describes
when the functions $\operatorname{mum}_\alpha(\cdot)$ and $\operatorname{mum}_\beta(\cdot)$ are comparable; this is done in
terms of a concept called majorization, which appears in a vast number of other
situations.


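Muirhead means can also be expressed in terms of permanental ones:

\[
\operatorname{mum}_\alpha(z) = \bigl[ \operatorname{pm}(M_\alpha(z)) \bigr]^{\frac{n}{\alpha_1 + \cdots + \alpha_n}}
\quad \text{where} \quad
M_\alpha(z) :=
\begin{pmatrix}
z_1^{\alpha_1} & \cdots & z_n^{\alpha_1} \\
\vdots & & \vdots \\
z_1^{\alpha_n} & \cdots & z_n^{\alpha_n}
\end{pmatrix}.
\]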
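The identity above can be checked numerically on small examples. The sketch below is only an illustration: the helper names (`permanent`, `pm`, `muirhead_via_pm`, `muirhead_direct`) are ours, the permanent is computed from its defining sum, and the matrix $M_\alpha(z)$ is taken with entry $z_j^{\alpha_i}$ in row $i$ and column $j$ (the transposed convention gives the same permanent).

```python
from itertools import permutations
from math import factorial, prod

def permanent(a):
    """Permanent via its defining sum over permutations (small n only)."""
    n = len(a)
    return sum(prod(a[i][s[i]] for i in range(n))
               for s in permutations(range(n)))

def pm(a):
    """Permanental mean: n-th root of the mean of the diagonal products."""
    n = len(a)
    return (permanent(a) / factorial(n)) ** (1.0 / n)

def muirhead_via_pm(alpha, z):
    """Muirhead mean computed as pm(M_alpha(z)) ** (n / sum(alpha))."""
    n = len(z)
    m = [[z[j] ** alpha[i] for j in range(n)] for i in range(n)]
    return pm(m) ** (n / sum(alpha))

def muirhead_direct(alpha, z):
    """Muirhead mean computed directly from its definition."""
    n = len(z)
    total = sum(prod(z[s[i]] ** alpha[i] for i in range(n))
                for s in permutations(range(n)))
    return (total / factorial(n)) ** (1.0 / sum(alpha))

z = [1.0, 2.0, 4.0]
for alpha in ([1, 0, 0], [1, 1, 0], [2, 1, 0], [1, 1, 1]):
    # The two columns should agree up to rounding error.
    print(alpha, muirhead_direct(alpha, z), muirhead_via_pm(alpha, z))
```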


Therefore we can deduce from the Law of Large Permanents an ergodic theorem 
for Muirhead means, namely: 


Corollary 5.2. Let $(X, \mu)$, $(Y, \nu)$ be Lebesgue probability spaces, $T\colon X \to X$,
$S\colon Y \to Y$ be measure-preserving ergodic transformations, and $g \in B(\mu)$, $h \in B(\nu)$.
Then, for $\mu \times \nu$-almost every $(x,y)$, the limit

\[
\lim_{n\to\infty} \operatorname{mum}_{(h(y), \dots, h(S^{n-1}y))} \bigl( g(x), g(Tx), \dots, g(T^{n-1}x) \bigr)
\]
exists and equals
\[
\operatorname{mum}_h(g) := \bigl[ \operatorname{sm}(g^h) \bigr]^{1 / \int h \, d\nu}. \tag{5.4}
\]
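The exponent in (5.4) can be traced to Birkhoff's ergodic theorem. The following is only a sketch of the computation, under the assumptions that $g^h$ stands for the function $(x,y) \mapsto g(x)^{h(y)}$, that Theorem 4.1 applies to it, and that the matrix $M_n$ below coincides with $\square_n (g^h)(x,y)$ up to transposition (which does not affect the permanent); the symbols $M_n$ and $s_n$ are introduced here for illustration.

```latex
% M_n is the n x n matrix with entries g(T^j x)^{h(S^i y)}, 0 <= i, j <= n-1,
% and s_n := h(y) + h(Sy) + ... + h(S^{n-1} y).
\[
\operatorname{mum}_{(h(y), \dots, h(S^{n-1}y))}\bigl(g(x), \dots, g(T^{n-1}x)\bigr)
  = \bigl[\operatorname{pm}(M_n)\bigr]^{n / s_n},
\]
\[
\operatorname{pm}(M_n) \to \operatorname{sm}(g^h)
  \quad \text{(Law of Large Permanents)},
\qquad
\frac{s_n}{n} \to \int h \, d\nu
  \quad \text{(Birkhoff's ergodic theorem)},
\]
\[
\text{hence} \quad
\bigl[\operatorname{pm}(M_n)\bigr]^{n / s_n}
  \to \bigl[\operatorname{sm}(g^h)\bigr]^{1 / \int h \, d\nu}.
\]
```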

Remark 5.3. Formulas (5.3) and (5.4) can be thought of as “continuous” (i.e., functional)
versions of the symmetric and the Muirhead means, respectively.


6. Open questions and directions for future research 

Permanents and Sinkhorn decompositions also make sense for multidimensional 
matrices and functions (see e.g. [Bal]). We believe that most of the results of this 
paper can be extended accordingly, but we have not checked the details. 

After having obtained a law of large numbers, a natural step is to look for a
central limit theorem. In the particular case of symmetric means covered by [HS]
and Theorem 5.1 above, a central limit theorem was obtained by Székely [Sz], under
the assumption of independence.

Concerning the Law of Large Permanents itself, we do not believe that the statement
of Theorem 4.1 is the optimal one. We would like to weaken the hypothesis
that $f$ is (essentially) bounded away from 0 and $\infty$. This assumption was used
twice in the proof: first, to apply Theorem 3.6, and second, to use the regularity 
estimates for the permanent from § 4.3. So in order to strengthen the Law of Large 
Permanents we will probably need to solve other problems which are themselves of 
independent interest: 

• To obtain explicit necessary and sufficient conditions on the function $f$ for
the existence of a functional Sinkhorn decomposition $f = \varphi g \psi$ in the sense
of Definition 3.4, or at least in the more restricted setting of direct products
(§ 3.5).
• To improve the estimates from § 4.3 so that in particular we are able to
deal with matrices containing zero entries.

The last and maybe most interesting line of research motivated by the results
of this paper is to extend the Law of Large Permanents to infinite matrices whose
entries form an $\mathbb{N}^2$-indexed stochastic process (or, in more dynamical terms, whose
entries are the values of a given function along the orbit of an ergodic $\mathbb{N}^2$-action).
Let us be more precise.

Suppose that $T$ is an ergodic measure-preserving action of the semigroup $\mathbb{N}^2$
on a Lebesgue probability space $(\Omega, \mathcal{A}, P)$. Given a function $f\colon \Omega \to \mathbb{R}$ we define
the $n \times n$ matrix $\square_n f(\omega) := \bigl( f(T^{(i,j)}\omega) \bigr)_{i,j=0}^{n-1}$. Let $\mathcal{A}_1$ and $\mathcal{A}_2$ be the
sub-$\sigma$-algebras formed by the $T^{(1,0)}$-invariant and the $T^{(0,1)}$-invariant sets, respectively.
We believe that the following statement should hold:

Conjecture 6.1. If $\log f \in$ […] then for $P$-a.e. $\omega$,

\[
\lim_{n\to\infty} \operatorname{pm}(\square_n f(\omega)) = \operatorname{sm}_{\mathcal{A}_1, \mathcal{A}_2}(f).
\]

Conjecture 6.1 would follow from the next statement regarding sequences of 
matrices: 

Conjecture 6.2. Let $(A_n)$ be a sequence of matrices of increasing sizes $n \times n$, all
with row and column arithmetic means equal to 1 (i.e. $\frac{1}{n} A_n$ is doubly stochastic
for each $n$). Suppose that there exists $\lambda > 1$ such that each matrix $A_n$ is $\lambda$-bounded in the
sense (4.5). Then

\[
\lim_{n\to\infty} \operatorname{pm}(A_n) = 1.
\]
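To make the hypotheses of Conjecture 6.2 concrete, here is a small numerical experiment in Python. It is only an illustration and says nothing about the conjecture itself: the circulant family `circulant_example` is a hypothetical choice of ours, we assume that being $\lambda$-bounded in the sense (4.5) means that all entries lie in $[\lambda^{-1}, \lambda]$, and the brute-force permanent restricts us to very small $n$. By the van der Waerden bound (Egorychev, Falikman), every matrix satisfying the row and column condition has permanental mean at least 1, which the printed values should reflect.

```python
from itertools import permutations
from math import cos, factorial, pi, prod

def pm(a):
    """Permanental mean, with the permanent computed by brute force."""
    n = len(a)
    per = sum(prod(a[i][s[i]] for i in range(n))
              for s in permutations(range(n)))
    return (per / factorial(n)) ** (1.0 / n)

def circulant_example(n, eps=0.5):
    """Hypothetical test family with a_ij = 1 + eps * cos(2*pi*(i - j)/n), n >= 2.

    Every row and column has arithmetic mean 1, and all entries lie in
    [1 - eps, 1 + eps], so for eps = 0.5 the entries lie in [1/2, 2].
    """
    return [[1.0 + eps * cos(2 * pi * (i - j) / n) for j in range(n)]
            for i in range(n)]

for n in range(2, 8):
    a = circulant_example(n)
    print(n, pm(a))  # permanental mean of the n x n example
```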

It turns out that there are at least two claims in the literature at least as strong 
as Conjecture 6.2: see the references [G] and [ lc] . Unfortunately we are unable to 
verify the correctness of either. 


Acknowledgements. We thank M. Courdurier and M. Schraudner for helpful discussions
about concavity and $\mathbb{N}^2$-actions, respectively, and V.L. Girko for sending
us a copy of [G].